Peter Hafez - Chief Data Scientist - RavenPack | October 26, 2016
A wealth of big data analytics that captures the overall media sentiment in the US before the elections.
The 2016 US Presidential campaigns have degenerated to the level of a daytime talk-show, and the media is playing a big role in influencing the remaining voters. With just under two weeks to go, we’ve been able to generate a wealth of big data analytics on both Hillary Clinton and Donald Trump that captures the overall media sentiment in the US. With this, we can survey the lay of the election land and offer some analysis on the impact of media coverage for each candidate in the run-up to the November 8th elections.
As it stands at the moment, we see a very strong correlation between media sentiment and poll numbers when comparing the two candidates. As already widely reported, the data confirms that the release in early October of a damaging video of Mr. Trump making lewd comments was a clear turning point for his campaign. His subsequent accusations that the election will be rigged are not working in his favor either. Mr. Trump’s media sentiment continues to fall along with his recent poll numbers, translating into better polling numbers for Mrs. Clinton with just under two weeks until the election.
With regard to our data and methodology for this piece, we used RavenPack’s next-generation analytics platform, which is currently in the final phases of testing. The platform now tracks more than a quarter of a million different entities including companies, people, places, organizations, products, commodities, and more. In addition, we’ve added thousands of new event categories that will help our systems better classify the vast quantities of news that we process and, therefore, help our users make better use of RavenPack’s big data analytics.
Our enhanced people detection capability plays a key role in this analysis, and we’ve chosen to focus simply on the two key candidates in the first installment of our US elections analysis. We’ve calculated the average sentiment levels on a daily basis for each candidate across various categories, including everything from political endorsements to corruption allegations, to see how they are faring in the media. Once we have our sentiment scores, we also look at their poll numbers as published by the Financial Times. We then compare these two time series and look for interesting spikes and correlations in order to explain key events so far in this election campaign and show how they have impacted the polls.
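To make the daily averaging step concrete, here is a minimal sketch in Python. The record layout, category names, and score scale are illustrative assumptions, not RavenPack's actual data format, and the scores are made up.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (date, candidate, event category, sentiment score).
# The field layout and the score scale are assumptions for illustration only.
records = [
    ("2016-10-07", "Trump", "scandal", -0.8),
    ("2016-10-07", "Trump", "endorsement", 0.2),
    ("2016-10-07", "Clinton", "endorsement", 0.4),
    ("2016-10-08", "Trump", "scandal", -0.6),
    ("2016-10-08", "Clinton", "corruption-allegation", -0.2),
]

def daily_average_sentiment(rows):
    """Return {(date, candidate): mean sentiment across all categories}."""
    buckets = defaultdict(list)
    for date, candidate, _category, score in rows:
        buckets[(date, candidate)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}

daily = daily_average_sentiment(records)
```

Each candidate ends up with one number per day, which is the shape needed to line the sentiment series up against a daily poll series.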
Let’s start by looking at the polling trends since July when it became clear that both candidates were going to secure the nominations for their respective parties. The Financial Times provides a fantastic resource which we can use to examine the average of various polls conducted at the national level in the US. It should be noted that, according to the Financial Times’ methodology, the polling data they display is derived from polls published by Real Clear Politics. For our analysis, we’ve plotted the 14-day rolling average of not only the poll statistics but also the RavenPack sentiment statistics in order to smooth out short-term variations that don’t play a significant role in the medium-term trends.
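For readers who want to reproduce the smoothing, the sketch below shows one way to compute a 14-day rolling average. It assumes a trailing window (each value averages that day and the 13 days before it) and returns no value until a full window is available; the article does not say how the window is aligned or how the first days are handled, so both choices are assumptions.

```python
def rolling_mean(values, window=14):
    """Trailing rolling mean; None until a full window is available."""
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the observation that just left the window.
            running -= values[i - window]
        smoothed.append(running / window if i >= window - 1 else None)
    return smoothed

# Usage with a short window so the effect is easy to see:
print(rolling_mean([1, 2, 3, 4], window=2))
```

A longer window gives a smoother line but reacts more slowly to events such as a debate or a leak.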
Examining the poll numbers, we can see that Mr. Trump made significant headway in the polls in early July once he secured enough delegates to become the GOP’s presumptive nominee. In fact, he continued to gain steadily in the polls throughout August and September, closing the gap to Mrs. Clinton to a mere two percentage points by early October.
This progress was reversed by the release of an explicit video by the Washington Post on the 7th of October in which Mr. Trump makes demeaning comments about women, including remarks that may describe sexual assault. We can see the same effect appear in the media sentiment trends in the graphic below where Mr. Trump’s improving media sentiment is halted in early October by the release of the video.
Looking at Hillary Clinton’s poll and sentiment numbers, we can see that she started off with a significant advantage over Donald Trump at the beginning of July only for her lead in both to be whittled away until early October. Perhaps the most astonishing trend we see in the RavenPack results for Mrs. Clinton is the damage done to her media sentiment by the release by Wikileaks of emails from John Podesta, her campaign chairman, which place Mrs. Clinton in a bad light across a number of issues.
Mrs. Clinton’s media sentiment fell dramatically in the aftermath of the Podesta emails release, but the fallout from the Trump tape overshadowed those issues and allowed Mrs. Clinton to shake off the negative publicity resulting from the email release. Indeed, Mr. Trump’s media sentiment has continued to fall as the media seizes on his allegations that the election will be ‘rigged’ in Mrs. Clinton’s favor. Putting both media sentiment and poll numbers together in the same graphic, as we do in the next section, really drives this point home.
In order to compare the media sentiment and poll numbers directly, we’re going to analyze the spread between Hillary Clinton and Donald Trump in both series. In effect, we subtract Donald Trump’s average daily poll and sentiment numbers from Hillary Clinton’s. The resulting data series is the difference between the two candidates for each metric: positive numbers on either sentiment or polls are in Mrs. Clinton’s favor, whereas negative numbers are in Donald Trump’s favor. As with the polling data, we take a 14-day rolling average in order to smooth out short-term variations and better assess the trends in the data.
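The spread calculation is simple enough to sketch in a few lines of Python. The numbers below are made up for illustration; they are not the poll or sentiment values used in this analysis.

```python
def spread(clinton, trump):
    """Clinton minus Trump, day by day; positive values favor Clinton."""
    if len(clinton) != len(trump):
        raise ValueError("both series must cover the same days")
    return [c - t for c, t in zip(clinton, trump)]

# Made-up daily poll shares (percent) for three days.
clinton_polls = [45.0, 46.0, 44.0]
trump_polls = [43.0, 42.0, 45.0]

poll_spread = spread(clinton_polls, trump_polls)
print(poll_spread)  # [2.0, 4.0, -1.0]
```

The same function applies unchanged to the two daily sentiment series, which gives a sentiment spread on the same sign convention as the poll spread.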
The first observation that stands out is just how closely the media sentiment spread tracks the spread in the average of national polls for the two candidates. On the day we published this report, the correlation between the two was 0.83. Secondly, we can see that both spreads were moving in favor of Donald Trump right up until the first presidential debate that took place on the 26th of September.
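The article does not state which correlation measure produced the 0.83 figure. Assuming it is the standard Pearson coefficient, a minimal sketch of the calculation on two equal-length spread series looks like this:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

A value near 1 means the two spreads rise and fall together, a value near -1 means they move in opposite directions, and a value near 0 means there is no linear relationship. Note that smoothing both series with a rolling average tends to push the coefficient higher than it would be on the raw daily values.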
The media sentiment spread jumps in favor of Mrs. Clinton following the first debate and subsequent release of the Trump tape, but the sentiment spread quickly reverses and momentum, in the media at least, shifts back in favor of Mr. Trump. The reaction in the polls seems to trail the sentiment movements that we see in late September and early October by a few days. The bounce for Hillary Clinton in the polls in the first couple of days of October follows her improved media sentiment in the preceding days, while the media sentiment momentum swings back in favor of Mr. Trump in the middle of October after Wikileaks releases more emails from John Podesta that place Hillary Clinton in an unfavorable light. Indeed, we’ve started to see the poll spread narrow again in the last couple of days, but recent positive coverage of Mrs. Clinton suggests that this narrowing in the polls won’t be significant.
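One way to check whether polls really trail sentiment by a few days is to correlate today's sentiment spread with the poll spread several days later and see which delay fits best. The sketch below is an assumption about how such a check could be done, not a description of the method behind this article; the function names and the one-week maximum lag are illustrative.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def lagged_correlations(sentiment, polls, max_lag=7):
    """Correlate sentiment[t] with polls[t + lag] for lag = 0..max_lag."""
    if len(sentiment) != len(polls):
        raise ValueError("both series must cover the same days")
    results = {}
    for lag in range(max_lag + 1):
        earlier_sentiment = sentiment[: len(sentiment) - lag]
        later_polls = polls[lag:]
        results[lag] = pearson(earlier_sentiment, later_polls)
    return results

def best_lag(sentiment, polls, max_lag=7):
    """Lag (in days) at which sentiment lines up best with later polls."""
    correlations = lagged_correlations(sentiment, polls, max_lag)
    return max(correlations, key=correlations.get)
```

A best lag above zero would be consistent with polls reacting after the media does, although a short sample like this campaign's makes any such estimate noisy.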
As we move closer to election day, we’ll continue to follow the media sentiment trends for both candidates. As things stand now, Hillary Clinton still maintains a lead in the polls and more favorable media sentiment than Donald Trump.
That said, the media sentiment is still shifting, and either candidate is still vulnerable to a ‘news shock’ that could well shift the polls even more. For Trump to move into a position to win on election day, however, we would have to see the media sentiment spread move significantly in his favor. With Wikileaks promising more emails from the Clinton campaign, it is still possible for a media sentiment shift that large to take place. We are now in the situation where each candidate is hanging on, just waiting for the other to fall. Stay tuned for our next update closer to election day which will give more insight into the drama to come.