The Rise of Investable Indexes in Modern Investing

November 1, 2023

We explore the popularity of investable indexes and the use of NLP and Machine Learning in building successful index products.

Investable indexes have become a major force in the financial industry, attracting substantial investments from both retail and institutional investors. We delve into the reasons behind the growing popularity of investable indexes and explore how they are constructed, the current trends shaping the landscape, and the integration of alternative data, particularly driven by Natural Language Processing (NLP) and Machine Learning, in building successful index products.

The Passive vs. Active Debate

The financial world has witnessed an intense debate between active and passive investment managers. Critics argue that passive indexing merely replicates benchmark performance, making it a less attractive investment strategy. However, the data paints a different picture. According to S&P Dow Jones Indices' U.S. Year-End 2022 report, only 11% of U.S. active portfolio managers outperformed their benchmark over the past 10 years. This persistent underperformance by active managers has contributed to the increasing popularity of investable indexes.

The Cost-Savings Advantage

Another compelling reason for the rise of investable indexes is the cost-saving benefit they offer. The same S&P Dow Jones report revealed that passive investing has saved investors nearly $300 billion in cumulative management fees over the last two decades. This substantial reduction in expenses has further incentivized investors to opt for passive index-based strategies.

Constructing Investable Indexes

Investable indexes are meticulously constructed, adhering to several key principles:

Rules-Based Approach

These indexes follow a predefined set of rules outlined in the index's rulebook, ensuring transparency and consistency.

Systematic Methodology

Investable indexes are built systematically, relying on data-driven processes rather than subjective decision-making.


Transparency

These indexes provide clear insights into their composition and methodology, allowing investors to make informed decisions.


Investability

The components of these indexes are chosen to be easily accessible and tradable by investors.

To achieve these objectives, investable indexes require a calculation agent and an index administrator to oversee their maintenance and operation.
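
To make the rules-based, systematic principles above more concrete, here is a minimal Python sketch of how a rulebook might be turned into weights. The `Security` fields, the liquidity threshold, the constituent limit, and the tickers are all hypothetical assumptions chosen for illustration; they are not taken from any actual index rulebook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Security:
    ticker: str
    market_cap: float        # hypothetical units, e.g. USD billions
    avg_daily_volume: float  # hypothetical units, e.g. USD millions traded per day


def build_index(universe, min_volume, max_constituents):
    """Apply a fixed, illustrative rulebook and return ticker -> weight."""
    # Rule 1 (investability): keep only securities that trade enough.
    eligible = [s for s in universe if s.avg_daily_volume >= min_volume]
    # Rule 2 (selection): keep the largest names by market capitalization.
    selected = sorted(eligible, key=lambda s: s.market_cap, reverse=True)
    selected = selected[:max_constituents]
    # Rule 3 (weighting): weight proportionally to market capitalization.
    total_cap = sum(s.market_cap for s in selected)
    if total_cap <= 0:
        return {}
    return {s.ticker: s.market_cap / total_cap for s in selected}


# Hypothetical universe: CCC is large but too illiquid to pass the screen.
universe = [
    Security("AAA", market_cap=300.0, avg_daily_volume=50.0),
    Security("BBB", market_cap=100.0, avg_daily_volume=40.0),
    Security("CCC", market_cap=500.0, avg_daily_volume=1.0),
    Security("DDD", market_cap=50.0, avg_daily_volume=30.0),
]
weights = build_index(universe, min_volume=10.0, max_constituents=2)
print(weights)  # {'AAA': 0.75, 'BBB': 0.25}
```

Because every step is a fixed rule applied to data, anyone holding the same rulebook and inputs should arrive at the same weights, which is the sense in which such an index is transparent and systematic.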

Several trends are shaping the landscape and driving innovation in investable indexes:

In addition to these trends, the passive indexing landscape is also being disrupted by the use of alternative data.

Incorporating alternative data into passive indexing has become a critical source of product differentiation in a competitive market. Structured product specialists seek unique datasets to underpin indexes, aiming for both alpha generation and a compelling story behind the product.

Traditional sources of alpha are becoming overused, and alpha discovery is growing faster-moving and more complex, hence the growing need for alternative data. Data points such as news sentiment, credit card or transaction data, website traffic data, job data, satellite imagery, and other unconventional financial indicators are becoming invaluable for constructing innovative indexes.

[Figure: quant and fundamental convergence data flow]

The data explosion is reshaping asset pricing, forcing both active and passive investors to adapt. Active managers need to consume more data and trade more frequently to maintain alpha, while passive investors need to incorporate a more diverse set of data sources.

The Role of NLP in Successful Index Products

NLP is a Language AI technique that enables the collection, analysis, and categorization of textual data at scale. Amid this shift towards alternative data, NLP has emerged as a pivotal factor in improving the accuracy, efficiency, and transparency of index products.

Here are some practical applications of NLP within index products:

  • Sentiment analysis scores news articles and social media posts about companies, sectors, and other factors that are relevant to index construction. These scores can then guide decisions on assigning weights to index constituents or excluding them;
  • Named entity recognition identifies and continuously monitors entities such as companies, individuals, or products. It is used to establish filtering rules that eliminate irrelevant data, and it can also help pinpoint emerging companies or sectors that should be considered for inclusion in an index;
  • Topic modeling goes beyond mere keyword recognition to identify the underlying topics and themes across a single document or a corpus of documents. This information can then be used to identify emerging themes and create new indexes that track specific topics.
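
As a rough illustration of the first application above, the following Python sketch tilts a set of base weights by average news sentiment and excludes names whose sentiment is very negative. The score range of -1 to 1, the exclusion threshold, the tilt strength, and the tickers are hypothetical assumptions; a real index would define these in its rulebook and would source the scores from an NLP provider.

```python
from statistics import mean


def sentiment_tilted_weights(base_weights, sentiment_scores,
                             exclude_below=-0.5, tilt=0.5):
    """Tilt base weights by average sentiment, then renormalize to sum to 1.

    base_weights:     ticker -> starting weight
    sentiment_scores: ticker -> list of per-article scores in [-1, 1]
    exclude_below:    average score under which a constituent is dropped
    tilt:             strength of the adjustment, assumed to lie in [0, 1]
    """
    adjusted = {}
    for ticker, weight in base_weights.items():
        scores = sentiment_scores.get(ticker, [])
        avg = mean(scores) if scores else 0.0  # no coverage is treated as neutral
        if avg < exclude_below:
            continue  # exclusion rule: drop strongly negative names
        adjusted[ticker] = weight * (1.0 + tilt * avg)
    total = sum(adjusted.values())
    if total <= 0:
        return {}
    return {ticker: w / total for ticker, w in adjusted.items()}


# Hypothetical inputs: BBB has strongly negative coverage, CCC has none.
base = {"AAA": 0.5, "BBB": 0.3, "CCC": 0.2}
scores = {"AAA": [0.5, 0.5], "BBB": [-0.8, -0.6]}
tilted = sentiment_tilted_weights(base, scores)
print(tilted)
```

In this sketch AAA gains weight relative to CCC because of its positive coverage, while BBB is removed entirely. Treating missing coverage as neutral is a design choice; an index could equally exclude names with no coverage.
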

Some concrete examples

Investable Indexes leveraging NLP

The Credit Suisse RavenPack AI Sentiment Index

Powered by news sentiment, the Credit Suisse RavenPack AI Sentiment Index is based on a simple concept: a sector rotation strategy across US large caps that relies on earnings news sentiment data, which has been shown to have predictive power over long periods of time.
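
The general idea of a sentiment-driven sector rotation can be sketched in a few lines of Python. This is not the methodology of the Credit Suisse RavenPack AI Sentiment Index, which is not described here; the sector names, the scores, the choice of equal weighting, and the `top_n` parameter are all hypothetical assumptions.

```python
def rotate_sectors(sector_sentiment, top_n):
    """Equal-weight the top_n sectors ranked by aggregate sentiment.

    sector_sentiment: sector name -> aggregate sentiment score for the period.
    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(sector_sentiment.items(), key=lambda kv: (-kv[1], kv[0]))
    chosen = [name for name, _ in ranked[:top_n]]
    if not chosen:
        return {}
    weight = 1.0 / len(chosen)
    return {name: weight for name in chosen}


# Hypothetical aggregate earnings-news sentiment for one rebalance date.
sentiment = {"Tech": 0.4, "Energy": -0.1, "Health": 0.2, "Utilities": 0.0}
allocation = rotate_sectors(sentiment, top_n=2)
print(allocation)  # {'Tech': 0.5, 'Health': 0.5}
```

Calling the function again at each rebalance date with fresh scores gives the rotation: sectors enter or leave the allocation as their relative sentiment ranking changes.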

MSCI Future Mobility Index

NLP informs the MSCI Future Mobility Index by analyzing text gathered from diverse sources, including news, social media, and financial reports. Sentiment analysis offers insights into market reactions, named entity recognition identifies key players, and topic modeling unveils overarching themes. This approach helps ensure that companies' roles in future mobility technologies are accurately categorized and weighted, reflecting their influence in this dynamic sector.

The J.P. Morgan QUEST Cloud Computing Index

The J.P. Morgan QUEST Cloud Computing Index utilizes RavenPack’s NLP technology to construct its portfolio of companies associated with the cloud computing industry. The Index aims to offer exposure to companies strongly connected to the cloud computing sector based on news coverage prominence and recency, with weights determined using an optimization model.
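
One plausible way to combine prominence and recency into a single score per company is to discount each news mention by its age. The Python sketch below uses an exponential decay with a half-life; this is an assumption made for illustration, not the index's documented formula, and the prominence scale, half-life, and dates are hypothetical. The optimization step that sets the final weights is not shown.

```python
import math
from datetime import date


def coverage_score(mentions, as_of, half_life_days=30.0):
    """Sum mention prominence, discounted exponentially by age in days.

    mentions: iterable of (published_date, prominence) pairs, where
              prominence is a non-negative relevance measure.
    as_of:    the date on which the score is computed.
    """
    decay = math.log(2) / half_life_days
    score = 0.0
    for published, prominence in mentions:
        age_days = (as_of - published).days
        if age_days < 0:
            continue  # ignore items dated after the scoring date
        score += prominence * math.exp(-decay * age_days)
    return score


# Hypothetical mentions for one company in theme-related news.
as_of = date(2023, 11, 1)
mentions = [
    (date(2023, 11, 1), 1.0),   # published today, counts in full
    (date(2023, 10, 2), 1.0),   # 30 days old, counts about half
]
print(coverage_score(mentions, as_of))
```

Companies could then be ranked by this score to decide which are most strongly connected to the theme, with older or less prominent coverage contributing less.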

Investable indexes have transformed the investment landscape by offering cost-effective, transparent, and innovative strategies for both retail and institutional investors. As the financial industry continues to evolve, the integration of alternative data and NLP-driven insights is likely to further fuel growth and differentiation in investable indexes.
