Persistence in Factor-Based Supervised Learning Models
33 Pages. Posted: 29 Jun 2020
Date Written: June 4, 2020
In this paper, we document the importance of memory in machine learning (ML)-based models relying on firm characteristics for asset pricing. We come to three empirical conclusions. First, the pure out-of-sample fit of the models is often poor: we find that most R^2 measures are negative, especially when training samples are short. Second, we show that poor fit does not necessarily matter from an investment standpoint: what actually counts are measures of cross-sectional accuracy, which are seldom reported in the literature. Third, memory is key. The accuracy of models is maximal when both labels and features are highly autocorrelated. Relatedly, we show that investments are the most profitable when they are based on models driven by strong persistence. Average realized returns are the highest when the size of training samples is large and when the horizon of the predicted variable is also long.
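The three quantities the abstract relies on (out-of-sample R^2, cross-sectional accuracy, and autocorrelation) can be illustrated with a minimal sketch. The definitions below are assumptions chosen for illustration, not necessarily those used in the paper: the R^2 is benchmarked against a zero forecast, cross-sectional accuracy is a hit ratio on whether a stock is predicted and realized on the same side of the cross-sectional mean, and persistence is measured by lag-1 autocorrelation. All function names and the toy numbers are hypothetical.

```python
from statistics import mean


def oos_r2(realized, predicted):
    """Out-of-sample R^2 against a zero-forecast benchmark (an assumption)."""
    sse = sum((r - p) ** 2 for r, p in zip(realized, predicted))
    sst = sum(r ** 2 for r in realized)
    return 1.0 - sse / sst


def cross_sectional_hit_ratio(realized, predicted):
    """Share of assets placed on the correct side of the cross-sectional mean.

    Both inputs hold one value per asset for a single date.
    """
    r_bar, p_bar = mean(realized), mean(predicted)
    hits = [(r > r_bar) == (p > p_bar) for r, p in zip(realized, predicted)]
    return sum(hits) / len(hits)


def lag1_autocorr(series):
    """Pearson correlation between a series and its one-period lag."""
    x, y = series[:-1], series[1:]
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Toy cross-section of four assets: the predictions rank the assets
# correctly but are shifted upward by ten percentage points.
realized = [-0.02, -0.01, 0.01, 0.02]
predicted = [0.08, 0.09, 0.11, 0.12]

r2 = oos_r2(realized, predicted)                      # strongly negative
hit = cross_sectional_hit_ratio(realized, predicted)  # every asset on the right side

# Toy persistent characteristic: a steadily trending series.
rho = lag1_autocorr([1.0, 2.0, 3.0, 4.0, 5.0])
```

In this toy case the level error drives the R^2 far below zero while the relative ordering of the assets is recovered exactly, which is the sense in which poor fit need not hurt a long-short portfolio built from the predictions.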
Keywords: Factor investing, Machine learning, Asset Pricing, Autocorrelation
JEL Classification: C45, C53, G11, G12