The Big Data & Machine Learning Revolution: Event takeaways, slides & videos

Due to popular demand, RavenPack recently brought its Annual Research Symposium, titled “The Big Data & Machine Learning Revolution”, back to London. Here are my personal takeaways from the event. I have also linked to each individual presentation.

More than 600 finance professionals registered to attend the London Revolution. An excellent group of top finance professionals shared their latest research and experience with big data and machine learning. The event took place on April 24, 2018 at the Banking Hall, one of the most exquisite venues in Central London. In case you weren't able to attend, presentation slides and video recordings have now been made available.

The agenda included speakers from WorldQuant, Jefferies, and other top firms. They covered the most promising areas of machine learning and AI in finance, what’s real and what’s hype, as well as many real-world examples of how alternative data is used in the investment process. Several journalists also attended the Symposium. I recommend MarketBrains’ interesting follow-up article: “What the buy-side needs to know about the data explosion”.

It was a full-day event with several standalone speaking sessions, a few lightning talks, a machine learning panel, and a panel on alternative data.

Big Data Machine Learning Finance

My Takeaways

The event started off with our own Armando Gonzalez, RavenPack CEO, going through the concepts of Big Data and Machine Learning in finance, and arguing why we are experiencing a Big Data and Machine Learning Revolution. Armando highlighted four areas that are being disrupted by artificial intelligence (AI) and machine learning: access to unstructured content, the data modelling/cleaning process, the generation of alpha signals (predictive analytics), and portfolio optimization.

Following Armando’s opening address, Nitish Maini, General Manager of WorldQuant’s Virtual Research Center, took the stage. He started off an interesting day by highlighting the similarities and differences between quantitative and discretionary investing with an emphasis on the importance of data. He also provided a practical example of how a quant might build an alpha signal using WorldQuant’s own Websim simulator. This was the first of many practical examples during the day.

Next, on behalf of RavenPack, I had a chance to present some of the recent research done by the RavenPack Data Science team. In particular, I showcased how news sentiment can be a valuable input to a global portfolio, with positive performance in 42 of 47 countries. We created a set of long/short strategies that combine country-level signals into both regional and global portfolios, evaluated at different holding periods.

Manoj Saxena, Chairman, CognitiveScale, followed up with an engaging talk about the potential of Artificial Intelligence and Machine Learning, recognizing the extreme hype surrounding AI in the current marketplace. His conclusion: robots will neither kill us (fears driven by Hollywood), nor take away our jobs (fears driven by the PR machine from major corporations). Instead, the big opportunity with AI is augmentation, not automation. It will change how we work. He argued that it will become the biggest game changer in finance since the introduction of spreadsheets. AI can help us reason across unstructured data like news or images, something we haven’t been able to do previously. Manoj also argued that you need to get into the AI space now; otherwise, you will exponentially get left behind. AI systems learn exponentially, while rules-based systems improve linearly. All very valid and thought-provoking points!

Asger Lunde, Director, Copenhagen Economics and Professor of Economics, Aarhus University, explored how to use news sentiment indexes for Chinese macroeconomic time series forecasting and found an improvement on shorter-term prediction horizons when including his news indexes. Addressing a question from the audience, he argued that it was too early to say if news could completely replace standard macroeconomic variables. However, it is definitely worth taking news into consideration when doing macroeconomic forecasts.

Dimitri Huwyler, Head of Quantitative Strategy, and Aleksandar Pramov, Quantitative Researcher, both at Next Gate Capital, shared a session on the use of machine intelligence and alternative data for market timing, walking through a practical example of enhancing a trend-following strategy. In particular, they used classic variables to build economic climate and global sentiment indicators, enhanced with news sentiment, particularly on politics and monetary policy (two fields that are very difficult to handle with classic datasets). Based on their research, they concluded that alternative data doesn’t have to be used only to drive entirely new alphas, but can also be used to enhance established alpha signals.

We then had the first panel of the day, focusing on “the State of Machine Intelligence in Capital Markets”. The moderator, Roland Fejfar, Executive Director, Morgan Stanley Fintech IB Division, did an excellent job of engaging the audience as well as our panelists in a lively discussion. The panel consisted of Mark Salmon, Professor, Cambridge University; John "Morgan" Slade, CEO, CloudQuant; and Andrej Rusakov, Co-founding Partner, Data Capital Management. They covered topics such as correlation vs. causation, with causal prediction being one of the hot topics in academic research at the moment. They also talked about the challenges of convincing traditional fund managers to apply new machine learning techniques due to the lack of track record, interpretability, or even credibility. The panel agreed that machine intelligence in finance is still in its early stages and that other sectors are likely several years ahead. However, knowledge transfer into finance from other domains isn’t necessarily that easy due to the complexity of the capital markets. When talking about the future of AI, the panel agreed that in 3-5 years we might be using completely new techniques that we can’t even envisage today, or we might use existing techniques in completely new ways. We surely live in exciting times.

Ada Lau, Quantitative Strategist at J.P. Morgan Securities (Asia Pacific), went on to demonstrate how big data and Machine Learning (ML) can add value to systematic strategies. In particular, she considered the examples of an equity mean-reversion strategy in Japan (white paper) and an ML-driven global value strategy (white paper). She showed how news volume and sentiment improved both strategies. In fact, she highlighted news and social media sentiment as being amongst the most promising alternative datasets available in the marketplace.

Other interesting talks included the following:

Andrej Rusakov, Co-founding Partner at Data Capital Management, shared some best practices and real-life examples of applying Machine Learning to investing.

Mark Salmon continued the discussion from the earlier panel by discussing the next steps of machine learning in finance. In particular, he focused on the recent academic literature that attempts to ensure causation rather than correlation in the use of machine learning.

Richard Bateson, Director of Bateson Asset Management, focused on combining alternative news and sentiment data with traditional signals and showed how the combination can provide increased risk-adjusted returns in long/short equity portfolios.

Louis Scott, Founder of Kiema Advisors, tried to answer the question “Do factors perform differently under news driven sentiment?”. The results showed a strong difference in the underlying distribution of factor returns between high and low sentiment regimes, which was particularly evident when evaluating Sortino ratios.
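For readers less familiar with the Sortino ratio mentioned above, here is a minimal sketch of how it is commonly computed: the mean return in excess of a target, divided by the downside deviation (which penalizes only returns below the target). This is not the method or data from the talk; the function name, the zero target return, and the two sample return series are assumptions made purely for illustration.

```python
import math


def sortino_ratio(returns, target=0.0):
    """Mean excess return over `target` divided by downside deviation.

    Downside deviation is the root mean square of the shortfalls below
    the target, taken over all periods (periods above target count as 0).
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    excess = [r - target for r in returns]
    mean_excess = sum(excess) / len(excess)
    downside_dev = math.sqrt(sum(min(e, 0.0) ** 2 for e in excess) / len(excess))
    if downside_dev == 0.0:
        # No period fell below the target, so the ratio is unbounded
        # (or zero if the mean excess return is also zero).
        return math.inf if mean_excess > 0 else 0.0
    return mean_excess / downside_dev


# Hypothetical per-period factor returns, split by sentiment regime.
high_sentiment_returns = [0.02, -0.01, 0.03, -0.02]
low_sentiment_returns = [0.01, -0.03, 0.02, -0.04]

print("high sentiment:", sortino_ratio(high_sentiment_returns))
print("low sentiment:", sortino_ratio(low_sentiment_returns))
```

Because only downside moves enter the denominator, two regimes with similar average returns can show quite different Sortino ratios when one of them has a heavier left tail.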

Michael Mayhew, Principal at Integrity Research, gave the first lightning talk of the day, highlighting a number of lessons for vendors and users of alternative data about the risks of selling or using data that could provide personally identifiable information.

Our own Jason Cornez, Chief Technology Officer at RavenPack, took the stage to talk about how RavenPack is able to automatically detect thousands of different types of market-moving events in unstructured text documents. He then went on to highlight the innovations currently under way at RavenPack to enrich the events the system can detect going forward.

As an introduction to the final panel of the day, Dan Furstenberg, Head of Data Strategy at Jefferies, discussed alternative data integration on the buy-side. He highlighted how nearly 50 fundamental investors are actively building these efforts and provided insights into how they are approaching alternative data from a talent, infrastructure and resourcing perspective. These are interesting lessons if you are thinking about building out your own data science efforts.

We finished off an exciting day with an enlightening panel on alternative data, “how to avoid the alternative data pitfalls”, moderated by Dan Furstenberg from Jefferies. Panelists included Leigh Drogen, CEO, Estimize; Rich Brown, Managing Director, Schonfeld Strategic Techworx; Michael Mayhew, Principal, Integrity Research; and myself. Panelists were asked to share their opinions on useful or disappointing datasets that they have come across in their careers. Geolocation data was one of the datasets discussed. Even though this data was considered promising from an alpha perspective, it carries serious compliance risk, which makes it hard to monetize in practice. Other datasets that were discussed included social media sentiment and satellite imagery. Take a look at the video to hear people’s views on these datasets. The panel closed with the final comment: “Okay, cocktails it is...”.

Slides & Videos

To ensure that no one misses out, I’m adding the links to the slides and videos of the event here.
