Cut the Chit-Chat: A New Framework for the Application of Generative Language Models for Portfolio Construction
28 Pages Posted: 31 Dec 2024
Date Written: October 28, 2024
Abstract
Current applications of generative language models (GLMs) to portfolio construction use the chat functionality of such models to forecast expected returns. The current literature classifies the outputs of these models using discrete labels that ignore the magnitude of sentiment. We show that this procedure is not optimal for cross-sectional portfolio construction. In this paper, we introduce Logit Extraction, a methodology for extracting the probabilities the model assigns to sentiment labels, allowing for the formulation of a continuous-valued ranking variable for portfolio construction. We demonstrate that Logit Extraction significantly enhances risk-adjusted returns over previous discrete-label approaches. We make available an open-source implementation of Logit Extraction in the Python package "TokenProbs".
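The idea of turning label probabilities into a continuous ranking variable can be illustrated with a minimal sketch. It is not the TokenProbs API and not necessarily the exact construction used in the paper. It assumes that the model's logits for three candidate label tokens ("positive", "neutral", "negative") have already been obtained, and it uses P(positive) minus P(negative) as one plausible score. The tickers, logit values, and function names are all hypothetical.

```python
import math

def label_probabilities(label_logits):
    """Softmax restricted to the candidate label tokens (assumed already extracted)."""
    m = max(label_logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(v - m) for label, v in label_logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def sentiment_score(label_logits):
    """Continuous score: P(positive) - P(negative). An illustrative choice, not the paper's stated formula."""
    p = label_probabilities(label_logits)
    return p["positive"] - p["negative"]

# Hypothetical label logits for three stocks' news items.
logits_by_ticker = {
    "AAA": {"positive": 2.0, "neutral": 0.5, "negative": -1.0},
    "BBB": {"positive": 0.6, "neutral": 0.5, "negative": 0.4},
    "CCC": {"positive": -1.0, "neutral": 0.0, "negative": 2.5},
}

# Discrete labels: the argmax label per stock, which discards magnitude.
discrete = {t: max(l, key=l.get) for t, l in logits_by_ticker.items()}

# Continuous scores and the resulting cross-sectional ranking (best first).
scores = {t: sentiment_score(l) for t, l in logits_by_ticker.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy data, AAA and BBB both receive the discrete label "positive", so a discrete-label sort cannot separate them, whereas the continuous score ranks AAA strictly above BBB.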
Keywords: large language models, news sentiment, portfolio construction, generative AI
JEL Classification: G11, C45, G17