Insights and Limitations in Stock Price Prediction Using LLMs

January 15, 2024

Peter Hafez, Chief Data Scientist at RavenPack, analyzes the impact and limitations of LLMs in stock price predictions.

Peter Hafez

Chief Data Scientist

RavenPack

Promising results in stock price prediction using LLMs have been showcased in the now-famous paper "Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models". This paper, released earlier this year, has garnered significant attention, becoming one of the Top 40 most downloaded papers of all time on SSRN with over 56,000 downloads. While the paper demonstrates encouraging results in predicting future stock returns, caution is warranted before drawing strong conclusions.

Internal research at RavenPack suggests that the results are sensitive not only to the version of the GPT model used but also to how the strategy is implemented. The strong performance reported in the paper relies heavily on the assumption that trades are filled at the open price, which is impractical in real-world trading. Even with an alternative implementation that trades at the 15-minute volume-weighted average price (VWAP), the value diminishes.
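To make the difference between an open-price fill and a VWAP fill concrete, here is a minimal Python sketch. It is not RavenPack's or the paper's implementation: the `MinuteBar` structure, the `vwap_entry_price` function, and the sample prices are all hypothetical, and the sketch assumes one representative price per one-minute bar.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MinuteBar:
    """Hypothetical one-minute bar; field names are assumptions for illustration."""
    minute: int    # minutes elapsed since the market open (0 = first bar)
    price: float   # representative traded price for the bar
    volume: float  # shares traded during the bar


def vwap_entry_price(bars: List[MinuteBar], window_minutes: int) -> float:
    """Entry price for a trade executed over the first `window_minutes` of the session.

    A window of 0 is treated as an idealized fill at the open price,
    approximated here by the price of the earliest bar.
    """
    if window_minutes == 0:
        return min(bars, key=lambda b: b.minute).price
    in_window = [b for b in bars if 0 <= b.minute < window_minutes]
    total_volume = sum(b.volume for b in in_window)
    if total_volume == 0:
        raise ValueError("no traded volume inside the VWAP window")
    # Volume-weighted average: sum(price * volume) / sum(volume)
    return sum(b.price * b.volume for b in in_window) / total_volume


# Made-up bars: the price drifts up after the open, so a VWAP fill is worse for a buyer.
bars = [MinuteBar(0, 100.0, 1000), MinuteBar(1, 101.0, 3000), MinuteBar(20, 105.0, 500)]
open_fill = vwap_entry_price(bars, 0)
vwap_15 = vwap_entry_price(bars, 15)
```

In this made-up example the stock moves in the direction of the signal shortly after the open, so the 15-minute VWAP fill captures less of the move than the idealized open-price fill. That is the kind of erosion a wider execution window can cause.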

In the figure below, we illustrate the impact of widening the VWAP window on the performance of the open-price strategy, applying the same prompt as in the paper. Performance deteriorates markedly, even with a shift to a 15-minute VWAP. The results also highlight differences in the open-price implementation between the March and June versions of GPT 3.5 Turbo. Because the models are black boxes, explaining this shift in performance is impractical. Nonetheless, we presume that the paper used the March version.

[Figure: four panels, one per VWAP window (0, 15, 30, and 60 minutes), showing cumulative log-returns from January 2022 to January 2023 for gpt-3.5-turbo-0301 and gpt-3.5-turbo-0613.]
Figure 1: Cumulative log-returns for a portfolio of US Top 3000 companies, trading open-to-open returns, buying stocks with positive sentiment and shorting stocks with negative sentiment according to the GPT 3.5 Turbo model.
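The quantity plotted in Figure 1 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method behind the figure: the function names, the equal weighting within the long and short legs, and the sample numbers are all hypothetical, and transaction costs are ignored.

```python
import math
from typing import Dict, List


def daily_long_short_return(signals: Dict[str, int], returns: Dict[str, float]) -> float:
    """Return of an equal-weighted long-short portfolio for one day.

    signals: ticker -> +1 (positive sentiment), -1 (negative), 0 (neutral).
    returns: ticker -> simple open-to-open return for the same day.
    Equal weighting within each leg is an assumption of this sketch.
    """
    longs = [returns[t] for t, s in signals.items() if s > 0 and t in returns]
    shorts = [returns[t] for t, s in signals.items() if s < 0 and t in returns]
    long_leg = sum(longs) / len(longs) if longs else 0.0
    short_leg = sum(shorts) / len(shorts) if shorts else 0.0
    return long_leg - short_leg


def cumulative_log_returns(daily_returns: List[float]) -> List[float]:
    """Running sum of log(1 + r) over a series of simple daily returns."""
    total = 0.0
    path = []
    for r in daily_returns:
        total += math.log1p(r)
        path.append(total)
    return path


# Made-up one-day example: long A, short B, no position in C.
signals = {"A": 1, "B": -1, "C": 0}
returns = {"A": 0.02, "B": -0.01, "C": 0.05}
day_return = daily_long_short_return(signals, returns)
path = cumulative_log_returns([day_return, -0.01])
```

Log-returns are additive over time, which is why the cumulative series is a running sum rather than a running product.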

It’s important to note that this analysis doesn't diminish the potential value of LLMs in systematic investing. While achieving Sharpe Ratios above 3 may pose challenges, the RavenPack Data Science team remains optimistic about the applications of LLMs in finance based on internal research, and we anticipate sharing more of our findings on this topic throughout 2024.
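For readers less familiar with the metric, an annualized Sharpe Ratio can be estimated from daily strategy returns as in the sketch below. The function name is hypothetical, and the zero risk-free rate and the 252 trading days per year are simplifying assumptions, not figures taken from the paper or from RavenPack's research.

```python
import math
import statistics
from typing import List


def annualized_sharpe(daily_returns: List[float], periods_per_year: int = 252) -> float:
    """Annualized Sharpe Ratio: mean / sample std of daily returns, scaled by sqrt(periods).

    Assumes a zero risk-free rate and 252 trading days per year.
    """
    mean = statistics.mean(daily_returns)
    std = statistics.stdev(daily_returns)  # sample standard deviation
    if std == 0:
        raise ValueError("returns have zero variance; Sharpe Ratio is undefined")
    return mean / std * math.sqrt(periods_per_year)


# Made-up daily returns, for illustration only.
sharpe = annualized_sharpe([0.01, 0.03])
```

Because the ratio divides average return by volatility, anything that lowers average returns without lowering volatility, such as a less favorable fill price, reduces it directly.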


