
Factor Returns and Sentiment

View an extract of this session held at the London Big Data and Machine Learning Revolution event in April 2018. You can also access the full video and slides.


Using Style Research and RavenPack sentiment data, Louis constructs regional factors and sentiment indices in the spirit of Hafez and Xie (2016). Results show strong differences between periods of high and low sentiment. The use of a quarterly moving average is distinct from most findings, which show sentiment to be effective over horizons of intraday to a few days. In particular, a strong difference in the underlying distribution of factor returns is revealed, and the Sortino ratios are distinct under sentiment regimes.



Sentiment in Asset Prices?

The Nobel Memorial Prize in Economic Sciences for 2013 was awarded to Eugene Fama, Lars Peter Hansen, and Robert Shiller for their contributions to the empirical study of asset pricing.

The graph. A well-known member of the investment community, a member of the London Quant Group, is very fond of talking about this, calling it “The Graph”, and it is probably the reason why Bob Shiller has a Nobel Prize. What you see here is the perfect foresight model of dividends, so their discounted present value is that slowly evolving line. The other line is the real price of the market index, and you can see just how different in volatility those two series are. It's not just Shiller but other really knowledgeable people too, like John Cochrane, who makes that very same point in his article about the two reasons why Eugene Fama and Bob Shiller got their awards.

So what do we attribute all of that variation to?

This is actually also interesting because, at the time this came out, most academics were really focused on making sense of rational asset pricing. And this graph was a powerful counterargument to that in just one image. So why is this happening? Well, Gene Fama came back and said, "Well, you know, this time variation is due to risk aversion, which is tied to business cycles. Prices are low in bad times, when most people cannot afford to take risk, and prices go up in good times." But what Shiller said was actually much more than that: we're not rational investors, we're boundedly rational, and when that's true, irrational sentiment is what's driving market participants. So which of these two is it? That's what I'm going to try to talk about.

What's truly important is to get the story as to why we look at sentiment from the perspective of a longer-run investor. What I want you to take away is the argument. Much of the literature talks about really short horizons, you know, HFT. I think a lot of what's going on in the HFT space has been simple arbitrage trades that are put on when there isn't any new information, and when new information arrives in the form of news, people take those trades off. So it's really valuable at short horizons, but how do we make sense of it at the horizons of longer-horizon investors?

I want to focus on something very, very narrow. What I want to talk about is what I'll call sentiment regimes and how they affect the distribution of factor returns. One of the questions we'll ask is whether it's actually driven by macro information and business cycles, as Eugene Fama suggested. There's also a literature going back to Baker and Wurgler, and we want to know if this is actually distinct from that kind of signal.

And finally, we want to know if we can make money with this, and that requires finding a low-cost implementation, which here rebalances on a quarterly basis.

We used Style Research data because it allows for a consistent implementation with their long history of factor performance. The data that we're using is just the usual suspects, and we credit RavenPack for allowing us to condense all of that information, which they convert using their methodology, into a terabyte of data going back to the year 2000.

So everything's done in US dollars. We rebalance every year to get the 3,000 largest names globally, and we have something like 7,500 firms and close to 4,500 days of data going back all that time. The RavenPack dataset has millisecond timestamps with a whole host of identifiers and metadata tagging. For me, going back to 2000 across all regions, there was about a terabyte of data, and what I did, using Bash and some standard Linux scripts, was to take all that information, bin it down into five-minute increments, and take a count of positive and negative stories. By the way, we did the five-minute binning because the Federal Reserve suggests five minutes is the gold standard for doing realized volatility. And to be conservative for the early history, we just assume that we could not implement this any quicker than five minutes.
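As a rough illustration of the binning step just described, here is a minimal Python sketch. The talk used Bash and standard Linux scripts, so this is not the speaker's code. The record layout, a (timestamp, score) pair where the sign of the score marks a story as positive or negative, is an assumption made for illustration and is not RavenPack's actual schema.

```python
from collections import defaultdict
from datetime import datetime

BIN_SECONDS = 5 * 60  # five-minute bins, as in the talk


def floor_to_bin(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its five-minute bin."""
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    floored = seconds - seconds % BIN_SECONDS
    return ts.replace(hour=floored // 3600, minute=(floored % 3600) // 60,
                      second=0, microsecond=0)


def bin_stories(stories):
    """Count positive and negative stories per five-minute bin.

    `stories` is an iterable of (timestamp, score) pairs; the sign
    convention for `score` is hypothetical. Zero scores are skipped.
    """
    counts = defaultdict(lambda: {"pos": 0, "neg": 0})
    for ts, score in stories:
        if score > 0:
            counts[floor_to_bin(ts)]["pos"] += 1
        elif score < 0:
            counts[floor_to_bin(ts)]["neg"] += 1
    return dict(counts)


# Made-up stories with millisecond-resolution timestamps.
stories = [
    (datetime(2018, 4, 3, 9, 31, 12, 250000), 0.4),
    (datetime(2018, 4, 3, 9, 34, 59, 999000), -0.2),
    (datetime(2018, 4, 3, 9, 35, 0, 1000), 0.7),
]
binned = bin_stories(stories)
```

Here the first two stories land in the 09:30 bin and the third in the 09:35 bin, so each bin carries its own positive and negative counts.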

So at the close of every day, we stop using that information five minutes before the close, and any information arriving after that is tacked onto the next business day.
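The cutoff rule can be sketched along the following lines. The 15:55 cutoff assumes a 16:00 close, which the talk does not state, and the treatment of weekends (stories roll to the next weekday, holidays ignored) is likewise an assumption. The function names are hypothetical.

```python
from datetime import date, datetime, time, timedelta

# Assumed cutoff: five minutes before a 16:00 close (not stated in the talk).
CUTOFF = time(15, 55)


def next_business_day(d: date) -> date:
    """Next weekday after `d` (holidays are ignored in this sketch)."""
    d += timedelta(days=1)
    while d.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        d += timedelta(days=1)
    return d


def effective_date(ts: datetime) -> date:
    """Trading date a story is attributed to under the cutoff rule."""
    d = ts.date()
    if d.weekday() >= 5 or ts.time() >= CUTOFF:
        return next_business_day(d)
    return d
```

Under these assumptions, a story stamped at 15:54 on a Friday counts toward that Friday, while one stamped at 15:56 counts toward the following Monday.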

North America Sentiment

On the left-hand side we have a count of the stories by day and a rolling average of that information. It's very much dominated by the largest names in the US, so there are around 150 to 200 names that account for 70% of the news stories.


European Regional Sentiment

We aggregate at the country level, then take the country data and aggregate it into regions: Europe, Asia and North America. What I’m showing here is the decomposition of the time series of (positive minus negative) divided by (positive plus negative) sentiment. In red is a rolling four-year average.
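The ratio described above, positive minus negative over positive plus negative, can be sketched as follows. Summing story counts across countries before taking the ratio is an assumption about how the regional aggregation works, and returning 0.0 when there are no stories is also a choice made here for illustration. The country codes and counts are made up.

```python
def sentiment_ratio(pos: int, neg: int) -> float:
    """(pos - neg) / (pos + neg), bounded between -1 and 1."""
    total = pos + neg
    if total == 0:
        return 0.0  # assumption: treat a period with no stories as neutral
    return (pos - neg) / total


def regional_ratio(country_counts) -> float:
    """Pool (pos, neg) counts across countries, then take the ratio.

    `country_counts` maps a country code to a (pos, neg) tuple.
    Pooling counts before dividing is an assumption of this sketch.
    """
    pos = sum(p for p, _ in country_counts.values())
    neg = sum(n for _, n in country_counts.values())
    return sentiment_ratio(pos, neg)


# Made-up counts for two countries in one region.
europe = {"UK": (30, 10), "DE": (20, 20)}
europe_sentiment = regional_ratio(europe)  # (50 - 30) / (50 + 30) = 0.25
```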


Regional Sentiment, the first as in Hafez and Xie, the latter used in the paper.

On the left is as close to the implementation in the original paper as I can get with three regions. On the right is what I get after stripping out a four-year rolling average.
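One way to read "stripping out" a rolling average is to subtract a trailing mean from each observation. The sketch below does that. Whether the window includes the current observation, and how the early part of the sample is handled, are assumptions. A four-year window on daily data would be on the order of 1,000 trading days; the example uses a window of 2 only to keep the numbers readable.

```python
from collections import deque


def strip_rolling_mean(series, window):
    """Subtract a trailing `window`-observation mean from each value.

    The mean includes the current observation. Positions without a
    full window of history are returned as None.
    """
    buf = deque(maxlen=window)
    running = 0.0
    out = []
    for x in series:
        if len(buf) == window:
            running -= buf[0]  # value about to fall out of the window
        buf.append(x)
        running += x
        out.append(x - running / window if len(buf) == window else None)
    return out


detrended = strip_rolling_mean([1.0, 2.0, 3.0, 4.0], window=2)
# detrended == [None, 0.5, 0.5, 0.5]
```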



Is this business cycle driven or not? When you combine all of those, you get the raw measure.



Sentiment affects the distribution of returns. This study examined 11 factor returns across 3 regions and 4 time horizons (5, 10, 22 and 63 days), which gives 11 × 3 × 4 = 132 tests. Of these, 131 out of 132 are significant at the 10% level. All results for Europe and North America are significant below the 5 in 10,000 level.
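To make the regime comparison concrete, here is a small sketch that splits a factor return series by the sign of a paired sentiment value and computes a Sortino ratio in each regime. It is not the test used in the study. The regime definition (sign of sentiment), the zero target return, and the numbers are all assumptions for illustration.

```python
from math import sqrt


def sortino(returns, target=0.0):
    """Mean excess return over `target` divided by downside deviation.

    Returns None when there are no observations or no downside.
    """
    if not returns:
        return None
    excess = [r - target for r in returns]
    downside = sqrt(sum(min(e, 0.0) ** 2 for e in excess) / len(excess))
    if downside == 0.0:
        return None
    return (sum(excess) / len(excess)) / downside


def sortino_by_regime(returns, sentiment):
    """Split returns by the sign of the paired sentiment value."""
    pos = [r for r, s in zip(returns, sentiment) if s > 0]
    neg = [r for r, s in zip(returns, sentiment) if s < 0]
    return {"positive": sortino(pos), "negative": sortino(neg)}


factor_returns = [0.02, -0.01, 0.03, -0.02]  # made-up numbers
sentiment = [0.5, 0.2, -0.1, -0.3]           # made-up numbers
ratios = sortino_by_regime(factor_returns, sentiment)
```

In this toy data both regimes have the same mean return, but the negative-sentiment regime has the larger downside deviation and therefore the lower Sortino ratio.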

Monotonicity Results

Romano and Wolf monotonicity p-values, North America: negative sentiment in the left panel, positive sentiment in the middle, and all periods on the right.


Take Away

  • Sentiment and downside risk.
  • Survives practical implementation.
  • Style Research factor returns add value in tactical asset allocation (TAA).
  • RavenPack adds value in reducing big data to easy-to-access scales.
