The Memorization Problem: Can We Trust LLMs' Economic Forecasts?
113 Pages
Posted: 18 Apr 2025
Last revised: 2 Apr 2026
Date Written: April 15, 2025
Abstract
Large language models (LLMs) cannot be trusted for economic forecasts during periods covered by their training data. Under black-box access, counterfactual forecasting ability is non-identified when the model has seen the realized values: any observed output is consistent with both genuine skill and memorization. Any evidence of memorization represents only a lower bound on encoded knowledge. We demonstrate LLMs have memorized economic and financial data, recalling exact values before their knowledge cutoff. Instructions to respect historical boundaries fail to prevent recall-level accuracy, and masking fails as LLMs reconstruct entities and dates from minimal context. Post-cutoff, we observe no recall. Memorization extends to embeddings.
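The non-identification claim can be illustrated with a toy sketch. It is not the paper's method: the data-generating process, the two stand-in "models", and all names and numbers below are hypothetical, chosen only to show why black-box outputs on previously seen data cannot separate skill from memorization, while unseen (post-cutoff) inputs can.

```python
# Toy illustration (all names and values are hypothetical assumptions).

def true_process(x: float) -> float:
    """Assumed data-generating process for the realized values."""
    return 2.0 * x + 1.0

PRE_CUTOFF_INPUTS = [1.0, 2.0, 3.0]   # periods covered by "training data"
POST_CUTOFF_INPUTS = [4.0, 5.0]       # periods after the knowledge cutoff

# Realized values the memorizing model saw during training.
TRAINING_TABLE = {x: true_process(x) for x in PRE_CUTOFF_INPUTS}

def memorizer(x: float):
    """No forecasting skill: returns a stored value, or None if unseen."""
    return TRAINING_TABLE.get(x)

def skilled(x: float) -> float:
    """Stand-in for genuine skill: has learned the process itself."""
    return 2.0 * x + 1.0

def black_box_outputs(model, inputs):
    """All an outside observer can see: the outputs for given inputs."""
    return [model(x) for x in inputs]

# Pre-cutoff: the two models are observationally equivalent.
pre_same = (black_box_outputs(memorizer, PRE_CUTOFF_INPUTS)
            == black_box_outputs(skilled, PRE_CUTOFF_INPUTS))

# Post-cutoff: only here do the outputs differ.
post_same = (black_box_outputs(memorizer, POST_CUTOFF_INPUTS)
             == black_box_outputs(skilled, POST_CUTOFF_INPUTS))

print(pre_same, post_same)
```

In this sketch the pre-cutoff comparison cannot tell the two models apart, which is the sense in which forecasting ability is non-identified on data the model has seen.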
Keywords: Large language models, Generative AI, Forecasting, ChatGPT, Memorization, Lookahead Bias, Textual Analysis, Embeddings
JEL Classification: C53, C58, E37, G10, G17