Chronologically Consistent Large Language Models

21 Pages Posted: Last revised: 28 Feb 2025

Songrun He

Washington University in St. Louis - John M. Olin Business School

Linying Lv

Olin Business School, Washington University in St. Louis; Zhejiang University

Asaf Manela

Washington University in St. Louis - John M. Olin Business School; Reichman University

Jimmy Wu

Washington University in St. Louis - John M. Olin Business School

Date Written: January 15, 2025

Abstract

Large language models are increasingly used in the social sciences, but their training data can introduce lookahead bias and training leakage. A good chronologically consistent language model requires efficient use of training data to maintain accuracy despite time-restricted data. Here, we overcome this challenge by training chronologically consistent large language models timestamped with the availability date of their training data, yet accurate enough that their performance is comparable to state-of-the-art open-weight models. Lookahead bias is model- and application-specific because even if a chronologically consistent language model has poorer language comprehension, a regression or prediction model applied on top of the language model can compensate. In an asset pricing application, we compare the performance of news-based portfolio strategies that rely on chronologically consistent versus biased language models and estimate a modest lookahead bias.

Keywords: Large language model, chronological consistency, lookahead bias, training leakage, backtesting

JEL Classification: G11, G12, G17

Suggested Citation

He, Songrun and Lv, Linying and Manela, Asaf and Wu, Jimmy, Chronologically Consistent Large Language Models (January 15, 2025). Available at SSRN: https://ssrn.com/abstract=

Songrun He

Washington University in St. Louis - John M. Olin Business School

One Brookings Drive
Campus Box 1133
St. Louis, MO 63130-4899
United States

Linying Lv

Olin Business School, Washington University in St. Louis

Zhejiang University

38 Zheda Road
Hangzhou, 310058
China

Asaf Manela (Contact Author)

Washington University in St. Louis - John M. Olin Business School

One Brookings Drive
Campus Box 1133
St. Louis, MO 63130-4899
United States
314-935-9178 (Phone)

HOME PAGE: http://asafmanela.github.io/

Reichman University

Israel

Jimmy Wu

Washington University in St. Louis - John M. Olin Business School

One Brookings Drive
Campus Box 1133
St. Louis, MO 63130-4899
United States
