Expected Returns and Large Language Models
62 Pages. Posted: 21 Apr 2023. Last revised: 23 Aug 2024
Date Written: November 22, 2022
Abstract
We leverage state-of-the-art large language models (LLMs) such as ChatGPT and LLaMA to extract contextualized representations of news text for predicting stock returns. Our results show that prices respond slowly to news reports, which is indicative of market inefficiencies and limits-to-arbitrage. Predictions from LLM embeddings significantly improve over leading technical signals (such as past returns) and simpler NLP methods because they interpret news text in light of the broader article context. For example, the benefits of LLM-based predictions are especially pronounced in articles where negation or complex narratives are more prominent. We present comprehensive evidence of the predictive power of news for market movements across 16 global equity markets, using news articles in 13 languages.
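As a small illustration of why negation is difficult for count-based text methods such as Bag-of-Words, the sketch below shows two headlines with opposite meanings that produce identical word counts. This is not the paper's method: the headlines and the `bag_of_words` helper are invented for illustration, and only the Python standard library is used.

```python
from collections import Counter
import re


def bag_of_words(text: str) -> Counter:
    """Count lowercase word tokens, discarding word order (hypothetical helper)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


# Two invented headlines that use the same words but mean opposite things.
headline_a = "Profits rose and losses did not"
headline_b = "Losses rose and profits did not"

# A model that only sees word counts receives the same input for both.
same_counts = bag_of_words(headline_a) == bag_of_words(headline_b)
print(same_counts)
```

Because the two count vectors are equal, any predictor built only on them must assign both headlines the same forecast. A representation that depends on word order and surrounding context can, in principle, tell them apart.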
Keywords: natural language processing (NLP), Large Language Models (LLM), BERT, GPT, LLaMA, ChatGPT, Bag-of-Words, Word2vec, machine learning, return prediction
JEL Classification: G10, G11, G14, C14, C11, C21, C22, C23, C58