Persistence in Factor-Based Supervised Learning Models

33 Pages Posted: 29 Jun 2020

Date Written: June 4, 2020


In this paper, we document the importance of memory in machine learning (ML)-based models relying on firm characteristics for asset pricing. We come to three empirical conclusions. First, the pure out-of-sample fit of the models is often poor: we find that most R^2 measures are negative, especially when training samples are short. Second, we show that poor fit does not necessarily matter from an investment standpoint: what actually counts are measures of cross-sectional accuracy, which are seldom reported in the literature. Third, memory is key. The accuracy of models is maximal when both labels and features are highly autocorrelated. Relatedly, we show that investments are the most profitable when they are based on models driven by strong persistence. Average realized returns are the highest when the size of training samples is large and when the horizon of the predicted variable is also long.
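The abstract's contrast between pure out-of-sample fit (often negative R²) and cross-sectional accuracy can be illustrated with a minimal synthetic sketch. All choices below — the AR(1) persistence of 0.95, the slope of 0.02, the noise scale, the rolling-window design — are illustrative assumptions for exposition, not the paper's actual calibration or models:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic panel: N firms over T periods, one persistent characteristic
N, T = 200, 60
signal = np.zeros((T, N))
signal[0] = rng.normal(size=N)
for t in range(1, T):
    # Highly autocorrelated (persistent) feature, AR(1) with phi = 0.95
    signal[t] = 0.95 * signal[t - 1] + 0.1 * rng.normal(size=N)

beta = 0.02  # weak true predictive relation between feature and return
returns = beta * signal + rng.normal(scale=0.05, size=(T, N))

r2_list, acc_list = [], []
for t in range(36, T):
    # Expanding-window OLS slope estimated on all past data (hypothetical setup)
    X, y = signal[:t].ravel(), returns[:t].ravel()
    b = np.cov(X, y)[0, 1] / np.var(X)
    pred = b * signal[t]

    # Pure out-of-sample R^2: can easily be negative when the signal is weak
    resid = returns[t] - pred
    r2_list.append(1 - resid.var() / returns[t].var())

    # Cross-sectional accuracy proxy: do demeaned predictions and demeaned
    # realized returns agree in sign across firms at date t?
    acc = np.mean(np.sign(pred - pred.mean())
                  == np.sign(returns[t] - returns[t].mean()))
    acc_list.append(acc)

print(f"mean OOS R^2:                  {np.mean(r2_list):.3f}")
print(f"mean cross-sectional accuracy: {np.mean(acc_list):.3f}")
```

Even when the average out-of-sample R² hovers near or below zero, the cross-sectional accuracy can sit above 50%, which is the kind of gap the abstract argues matters for investment performance.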

Keywords: Factor investing, Machine learning, Asset pricing, Autocorrelation

JEL Classification: C45, C53, G11, G12

Suggested Citation

Coqueret, Guillaume, Persistence in Factor-Based Supervised Learning Models (June 4, 2020). Available at SSRN.

Guillaume Coqueret (Contact Author)

EMLYON Business School

23 Avenue Guy de Collongue
Ecully, 69132
France

