Stories are said to be unique in that they discuss a specific company, topic or theme, and reveal new information. In just a few seconds, these news stories are fed into thousands of news outlets, content aggregators, websites, and end-user applications like trading terminals. News is reproduced, redistributed, and duplicated into millions of news items in minutes. As more time goes by (usually a few days), the original stories become an uncontrollable stream of news items found just about anywhere from corporate or proprietary intranets to the Web. So, when does news become noise?
For starters, most original financial news comes from newswires and professional paid news services, not from blogs or other social media. News vendors usually sell their information under a subscription model, and investors pay a regular fee to receive it.
Accessing content in real time involves a fee, whereas consuming the information online typically costs less (or nothing) because it’s delayed. Investors will pay a premium to receive timely content such as economic indicators, corporate news, and sentiment data in a machine-readable format over a low-latency feed, sometimes even co-locating servers with the content provider to gain just a few milliseconds on delivery. The faster one can access the content and the more efficient the format (pre-analyzed for sentiment, relevance, and potential market impact), the higher the premium investors are willing to pay.
One of the reasons investors pay for content is to be closer to the original source and further away from the noise. Having access to direct feeds of original content helps remove noise, but doesn’t solve the problem. Even reputable, original sources may cover the same story yet each produce a separate, unique news item. The challenge is then to sift through the content of original publishers and find the “news” the first time it appears. This is where the burden falls more on the format of the stories (rich tags and analytics) than on speed.
The following factors play an important role in eliminating noise:
- Direct Access: Just because it’s not online when you look doesn’t mean the content isn’t already out there being consumed and traded on by investors. Most original breaking news in finance can take minutes to hours before you can read it online (and longer still to appear on a search engine or news aggregator service like Google News). Don’t be fooled by the “real-time” web argument. Let’s face it, most news online is not real-time.
- News Dissemination: Understand how news and information flow from the original content producer to the end consumer (the investor). Without this fundamental knowledge, as an investor you’re probably trading off old news even if it appears new to you.
- Identify the Source: Distinguish between sources, authors, publishers, aggregators, and even trading terminals (e.g. not everything on the Bloomberg terminal is Bloomberg News). For example, most financial sites like Google Finance, CNN Money, and Forbes get their news from an original publisher (and someone else has likely subscribed to a direct feed from the source).
- Novelty: Again, just because it’s news to you (or your news reader or aggregator) doesn’t mean it’s new and unique. A criterion for novelty is essential, and the right one depends on your trading strategy and horizon (intraday, daily, monthly, quarterly, etc.).
- Relevance: You must be able to describe (preferably in numerical terms) how pertinent, connected, or applicable a news story is to a given entity (e.g. a company). In Fig. 1, I have included a summary describing the distribution of relevance (according to RavenPack) across sentiment-based news stories for the constituents of the Dow Jones Industrial Average covering the years 2005 through 2008. The large number of low-relevance stories points to significant levels of noise when relevance is not taken into account. In particular, only about 22% of stories hold a high degree of relevance.
- Sentiment: If only it were as easy as “green” for “good” and “red” for “bad”. Sentiment is not absolute! Is it positive to one company but negative to another? Under what conditions and within what trading horizon? Are some news events more negative than others? Do you have sentiment data from one perspective (one technique) or multiple (various techniques)? Looking at news from various angles will likely yield a more representative interpretation of sentiment.
Fig. 1: Distribution of Relevance on news stories with sentiment for the Dow 30. Relevance is measured with a score of 0-100 where higher values indicate a story is more relevant to a company in the Dow 30.
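To make the relevance and novelty factors a little more concrete, here is a minimal sketch of how such a filter might look in Python. Everything in it is an assumption made for illustration: the `NewsItem` fields, the `event_key` tag, the relevance cut-off of 90, and the 24-hour novelty window are all hypothetical and do not reflect RavenPack’s (or any vendor’s) actual data format or thresholds. The sample items are made up.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class NewsItem:
    """Hypothetical news record; field names are illustrative only."""
    timestamp: datetime
    entity: str      # company the item is tagged with
    event_key: str   # assumed tag identifying the underlying story
    relevance: int   # 0-100, higher means more relevant to the entity
    source: str


def filter_noise(items, min_relevance=90, novelty_window=timedelta(hours=24)):
    """Keep items that are relevant enough and novel within the window.

    An item counts as a repeat if the same (entity, event_key) pair was
    seen at most `novelty_window` earlier. Each relevant repeat refreshes
    the window, so a steady stream of copies stays suppressed.
    Both thresholds are arbitrary defaults, not recommended values.
    """
    last_seen = {}
    kept = []
    for item in sorted(items, key=lambda i: i.timestamp):
        if item.relevance < min_relevance:
            continue  # passing mention: treat as noise
        key = (item.entity, item.event_key)
        previous = last_seen.get(key)
        last_seen[key] = item.timestamp
        if previous is not None and item.timestamp - previous <= novelty_window:
            continue  # same story again inside the window: not novel
        kept.append(item)
    return kept


# Made-up sample data for a fictional company.
t0 = datetime(2008, 9, 15, 9, 30)
items = [
    NewsItem(t0, "ACME", "earnings", 100, "newswire"),
    NewsItem(t0 + timedelta(minutes=5), "ACME", "earnings", 100, "aggregator"),
    NewsItem(t0 + timedelta(minutes=7), "ACME", "sector-roundup", 20, "blog"),
    NewsItem(t0 + timedelta(minutes=9), "ACME", "ceo-change", 95, "newswire"),
]
kept = filter_noise(items)
kept_sources = [(item.event_key, item.source) for item in kept]
```

With this sample, only the two newswire items survive: the aggregator copy is dropped as a repeat and the blog mention falls below the relevance cut-off. In practice the window and cut-off would have to be chosen to match the trading horizon, which is exactly the judgment call described under Novelty above.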
There are perhaps other factors to consider when addressing the question of when news becomes noise. David Leinweber does a really good job explaining some of these issues and provides a structured view for the sources of investment news and securities trading rumours in his book “Nerds on Wall Street”. I also read some postings by Jason Goepfert in Sentiment’s Edge where he discusses the importance of accuracy in sentiment analysis and the sources of content used to measure sentiment.
Overall, I think understanding the various aspects of news production and consumption is key to generating alpha from public information.