Machine Learning-Based Financial Statement Analysis
69 Pages. Posted: 27 Jan 2020. Last revised: 3 Dec 2020
Date Written: November 25, 2020
This paper explores the application of machine learning methods to financial statement analysis. We compare a range of models in the machine learning repertoire in their ability to predict the sign and magnitude of abnormal stock returns around earnings announcements based on past financial statement data alone. Random Forests produce the most accurate forecasts and the highest abnormal returns. (Nonlinear) neural network-based models perform relatively better for predictions of extreme market reactions, while the linear methods are relatively better in predicting moderate market reactions. Long-short portfolios based on model predictions generate sizable abnormal returns, which seem to decay over time. Abnormal returns are robust to various risk factors and load in expected ways on size, value and accruals. Analysing the underlying economic drivers of the performance of the Random Forests, we find that the models select as most important predictors financial variables required to forecast free cash flows and firm characteristics that are known cross-sectional predictors of stock returns.
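As an illustration of the long-short portfolios mentioned in the abstract, the following is a minimal sketch of how such a portfolio could be formed from model predictions. It is not the authors' implementation. The function name `long_short_return`, the `fraction` parameter, the equal weighting, and the toy numbers are all assumptions made for this example. The scores stand in for any model's predicted probability of a positive abnormal return.

```python
from statistics import mean


def long_short_return(predictions, realized, fraction=0.1):
    """Equal-weighted long-short return from ranked model predictions.

    predictions: model score per firm (e.g. predicted probability of a
                 positive abnormal return); hypothetical input format
    realized:    realized abnormal return per firm, in the same order
    fraction:    share of firms in each leg (assumed default: 10%)
    """
    if len(predictions) != len(realized):
        raise ValueError("predictions and realized must have the same length")
    n = len(predictions)
    k = max(1, int(n * fraction))
    if 2 * k > n:
        raise ValueError("fraction too large: the two legs would overlap")
    # Rank firms from lowest to highest predicted score.
    order = sorted(range(n), key=lambda i: predictions[i])
    short_leg = order[:k]   # lowest scores: sell
    long_leg = order[-k:]   # highest scores: buy
    return mean(realized[i] for i in long_leg) - mean(realized[i] for i in short_leg)


# Toy data, invented purely for illustration (not from the paper).
scores = [0.9, 0.2, 0.6, 0.1, 0.8, 0.4, 0.7, 0.3, 0.5, 0.95]
abnormal = [0.03, -0.02, 0.01, -0.04, 0.02, 0.00, 0.01, -0.01, 0.00, 0.05]
spread = long_short_return(scores, abnormal, fraction=0.2)
print(round(spread, 4))
```

With `fraction=0.2` and ten firms, each leg holds two firms. The spread is the mean realized abnormal return of the two highest-scored firms minus that of the two lowest-scored firms.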
Keywords: financial statement analysis, fundamental value, machine learning, earnings announcement, accounting-based anomalies, prediction
JEL Classification: G12, G14, M41