Predicting a Historic Election with News Analysis

November 09, 2020

How well did our monitor forecast the 2020 presidential election?

Max Colas, Managing Director at RavenPack, offers his take on the media’s impact on the 2020 US presidential election, where he sees RavenPack’s news sentiment proving to be a strong, effective, and reliable predictor.

Of the two primary colors, blue came out on top in 2020 as Joe Biden won the presidential election by 306 electoral votes to Trump’s 232. The result is very close to RavenPack’s forecast of a Biden victory by a mean of 313 votes to 225 for Trump and places our prediction within 3% of the expected nationwide result.
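As a quick check on the "within 3%" figure, here is a minimal sketch of the arithmetic. It assumes the percentage is the forecast error measured against Biden's actual electoral vote total; the article does not state the exact basis, so the share of the 538-vote college is shown as an alternative reading.

```python
# Forecast vs. actual electoral votes for Biden, as quoted in the article.
forecast_biden = 313
actual_biden = 306
total_college = 538  # size of the Electoral College

abs_error = forecast_biden - actual_biden

# Assumed reading 1: error relative to Biden's actual total.
rel_error = abs_error / actual_biden

# Assumed reading 2: error as a share of all electoral votes.
share_of_college = abs_error / total_college

print(f"Error: {abs_error} votes")
print(f"Relative to actual total: {rel_error:.1%}")
print(f"Share of the college: {share_of_college:.1%}")
```

Under either reading, the gap stays below the 3% mark.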

Over 6 months ago, when we set out to build a public Election Media Monitor, we wanted to test whether our sentiment models could also prove effective in the realm of politics. Most of our clients leverage our news-based sentiment indicators in the context of quantitative and multi-factor trading, risk management and alpha capture, but our experience with previous elections and the Brexit referendum gave us reason to believe the indicators might also work for elections. As we directed the powerful magnifying glass of our analytical infrastructure at election media coverage, we observed dynamics that make the initiative more insightful than a mere prediction of the ultimate outcome.

The two pillars of our news analytics approach are sentiment and media attention - or ‘buzz’ - and their performance in this election shows their strength within macro prediction models. When it came to the popular vote, for example, the combination of those indicators successfully identified the ultimate winner for the fifth time in a row. And whilst this predictive power does not address the specific nature of the correlation between the sentiments expressed in the media and those perceived by voters, it does confirm the relevance of media sentiment and attention as proxies for voter inclination.

According to our media monitor, Biden’s sentiment has consistently remained above Trump’s since March 2020, and it lifted off as the pandemic took hold in the spring. Trump briefly bounced back over the summer as the country reopened, but as election day approached, sentiment converged towards a near-even partisan split. Trump ended up with a 47% sentiment score vs 53% for Biden, and Biden won the race.

COVID invaded the campaigns, but there was no referendum

A trend analysis of COVID’s presence in the news shows that the pandemic remained the most covered topic during the election cycle. While the virus seemed to invade Trump’s coverage more than Biden’s during the first wave, the indicators crossed in early June, as COVID became more prominent in articles covering Biden than in those covering the Trump campaign through the summer. Both sides then saw a near alignment of indicators starting with the first presidential debate, as COVID crystallized candidates’ outlook.

While COVID remained pervasive in the news, the oft-described referendum on the administration’s response to the pandemic did not happen. In fact, during June and between mid-August and early October, COVID was significantly less associated with Trump coverage than in the spring. Either way, polls seem to have been unmoved by the virus.

Covid news impact on polls

State-level results remain hard to predict

Although the prediction of a Biden victory was accurate and we correctly predicted the majority of state results, our model disagreed with the actual outcome in several battleground states, including Florida, Georgia, Arizona, Nevada and New Hampshire. Of those, Florida, Arizona, Georgia and Nevada were close results subject to county-level fragmentation. However, the model accurately predicted a Biden win in Michigan and Wisconsin - two states flipped by Trump in 2016.

Interestingly, the model arrives at a very close electoral vote total despite reaching it through a different basket of states. This discrepancy illustrates the framework within which sophisticated analytics prove useful. Our indicators provide insights from a data lake whose borders do not exactly match the constraints of the physical world. The district map of each state, the rigidity of the electoral college system, nationwide availability of online news sources, overlapping media markets and other external factors may erode the local accuracy of the results. Yet collectively, particularly in a two-person race in a highly polarized political landscape, those errors seemed to balance one another across the nation.

Considered within a suitable data belief system as one input of a multi-factor prediction model, RavenPack’s media sentiment comes out of this electoral cycle strong, effective, and reliable.

Our Election Media Monitor dataset has a lot more to offer

As the election wraps up and the Biden camp proceeds towards establishing a transition team, political and academic researchers are poised to begin unpacking what will have been a unique presidential campaign, shadowed by the circumstances and consequences of a global pandemic, heated by unprecedented partisan rhetoric, blurred by fake news and fringe theories, stained by scattered outbreaks of violence, challenged by anger and the nationwide momentum of Black Lives Matter, and rendered perilously consequential by the threat of climate change. With so many factors influencing the outcome, researchers will welcome the granularity of the RavenPack Election Monitor dataset to disentangle the timeline of events and cast a light on the circumstances of the past several months. As with our COVID-19 dashboard -- which has already underpinned a substantial body of research -- RavenPack will continue to work with academic researchers and field specialists to draw actionable learnings from our datasets across finance, corporate risk management, political and social sciences, and even public health. If you would like to leverage our sentiment analysis data of the US elections, contact us.

Anger sentiment impact on projections

The Making of the Election Monitor

The 2020 RavenPack Election Media Monitor was underpinned by an in-house research model which defines an alternative state-level forecasting approach using sentiment analysis and media attention to forecast U.S. election results. The model leverages past ballot results corrected with indicators that take into account news sentiment, incumbency status, media attention and probabilistic variables, leading to projected distributions of electoral votes. If you’d like to know more, download our quantitative research paper.
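To give a feel for how a state-level model of this kind can produce a distribution of electoral votes, here is a minimal Monte Carlo sketch. It is not RavenPack's model, which is described only in the research paper. The state names, margins, sentiment values, weights and noise level below are all hypothetical, and media attention is left out for brevity. The idea is simply that each state's past margin is shifted by a sentiment term and an incumbency term, random noise is added, and the electoral votes of the states won are summed over many draws.

```python
import random
from statistics import mean

# Hypothetical toy inputs, not real data:
# state -> (electoral votes, prior challenger margin in points,
#           net news-sentiment tilt toward the challenger in [-1, 1])
STATES = {
    "A": (10, 4.0, 0.2),
    "B": (20, -1.0, 0.3),
    "C": (15, -6.0, -0.1),
    "D": (5, 0.5, 0.0),
}

def simulate_electoral_votes(states, sentiment_weight=5.0,
                             incumbency_bonus=1.0, noise_sd=3.0,
                             n_draws=10_000, seed=0):
    """Return one simulated challenger electoral-vote total per draw."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_draws):
        challenger_votes = 0
        for votes, prior_margin, sentiment in states.values():
            # Past result, shifted by sentiment, minus an assumed bonus
            # for the incumbent, plus Gaussian noise.
            margin = (prior_margin
                      + sentiment_weight * sentiment
                      - incumbency_bonus
                      + rng.gauss(0.0, noise_sd))
            if margin > 0:
                challenger_votes += votes
        totals.append(challenger_votes)
    return totals

totals = simulate_electoral_votes(STATES)
total_ev = sum(votes for votes, _, _ in STATES.values())
win_share = sum(t > total_ev / 2 for t in totals) / len(totals)
print("Mean challenger electoral votes:", round(mean(totals), 1), "of", total_ev)
print("Share of draws won by challenger:", round(win_share, 3))
```

The mean of the simulated totals plays the role of a headline forecast such as "a mean of 313 votes", while the spread of the draws conveys how uncertain that forecast is.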
