Between the Lines: Textual Features in Financial Reports and Expected Stock Returns
56 Pages. Posted: 19 May 2025
Date Written: April 21, 2025
Abstract
We comprehensively reassess the asset pricing implications of textual characteristics of financial reports. Using machine learning techniques, we aggregate information from features capturing similarity, readability, and sentiment of 10-K and 10-Q filings. The results show that the economic benefits of traditional textual analysis are limited. Text attributes fail to generate reliable return patterns or provide information beyond conventional stock characteristics. When return predictability does exist, it arises solely from similarity measures and occurs in specific market segments, such as microcaps, firms that receive little investor attention, and overpriced securities. Furthermore, the cross-sectional variation in expected returns fades over time and is concentrated in hard-to-trade stocks, which leaves it with minimal practical significance.
Keywords: textual analysis, readability, similarity, sentiment, cross-section of stock returns
JEL Classification: G11