Properties of Financial Texts
40 Pages Posted: 29 Nov 2023
Date Written: November 5, 2023
Abstract
Statistical properties of unstructured data are largely unknown. I find that word counts (positive words, negative words, text length), their combinations, and measures constructed from them are often non-stationary. For most of these time series, the ADF test rejects the null hypothesis of a unit root, whereas the KPSS test rejects the null hypothesis of trend stationarity. Visual evidence aligns with the KPSS outcome. This pattern is more pronounced for daily data. A direct comparison between a conventional frequency-based measure of news sentiment and a stationary counterpart demonstrates the economic impact. Predicting market returns with a non-stationary word-frequency measure yields contradictory empirical findings: forecast errors and the prediction beta are both higher in recessions than in expansions. After accounting for stationarity, the magnitude of beta decreases by over 50%, implying that sentiment's influence on equity market returns has been severely overstated.
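The two tests named in the abstract have opposite null hypotheses, which is why their outcomes can conflict. The sketch below is illustrative only and is not the paper's procedure or data. It runs on simulated series, uses a plain Dickey-Fuller regression with a constant and no lagged differences (the full ADF test adds them), and computes a KPSS trend statistic with a Bartlett-kernel long-run variance. The function names, the rule-of-thumb bandwidth, and the simulated AR(1) inputs are all assumptions made for illustration.

```python
import math
import random


def ols_simple(x, y):
    """OLS of y on a constant and x: returns slope, residuals, slope std. error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)
    return slope, resid, math.sqrt(s2 / sxx)


def df_tstat(y):
    """Dickey-Fuller t-statistic from dy_t = a + g*y_{t-1} + e_t (null: unit root)."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    g, _, se = ols_simple(y[:-1], dy)
    return g / se


def kpss_trend_stat(y):
    """KPSS statistic around a linear trend (null: trend stationarity)."""
    n = len(y)
    _, e, _ = ols_simple(list(range(n)), y)  # residuals around the fitted trend
    lags = int(4 * (n / 100) ** 0.25)        # rule-of-thumb bandwidth (assumption)
    lrv = sum(ei * ei for ei in e) / n       # lag-0 autocovariance
    for k in range(1, lags + 1):
        gamma_k = sum(e[t] * e[t - k] for t in range(k, n)) / n
        lrv += 2 * (1 - k / (lags + 1)) * gamma_k  # Bartlett weights
    partial, total = 0.0, 0.0
    for ei in e:
        partial += ei                        # partial sum of residuals
        total += partial * partial
    return total / (n * n * lrv)


def simulate_ar1(phi, n, seed):
    """Hypothetical input: y_t = phi*y_{t-1} + noise; phi = 1.0 is a random walk."""
    rng = random.Random(seed)
    y, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0, 1)
        y.append(prev)
    return y


DF_CRIT_5 = -2.86    # approximate asymptotic 5% critical value, constant only
KPSS_CRIT_5 = 0.146  # 5% critical value, trend case

for name, phi in [("stationary AR(1)", 0.5), ("random walk", 1.0)]:
    series = simulate_ar1(phi, 2000, seed=0)
    df, kp = df_tstat(series), kpss_trend_stat(series)
    print(f"{name}: DF t = {df:.2f} (reject unit root: {df < DF_CRIT_5}), "
          f"KPSS = {kp:.3f} (reject trend stationarity: {kp > KPSS_CRIT_5})")
```

A series for which the first test rejects a unit root while the second rejects trend stationarity corresponds to the conflicting pattern the abstract reports for most text-based measures.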
Keywords: Stationarity, Textual Analysis, Business Cycle, Sentiment, Forecast Errors
JEL Classification: C5, E32