Exploring the Factor Zoo With a Machine-Learning Portfolio
49 Pages. Posted: 9 Aug 2018. Last revised: 4 Sep 2019
Date Written: June 24, 2018
Over the years, top journals have published a factor zoo containing hundreds of characteristics, only to see many of them lose empirical significance over time. In this paper, we perform an out-of-sample factor-zoo analysis of the rise and fall of characteristics in explaining cross-sectional stock returns. To achieve this, we train different machine-learning (ML) models on 106 firm and trading characteristics to generate factor structures that relate characteristics to stock returns. Using the combined forecast from the ML models, we form an ML portfolio of predicted winner and loser stocks over 18 years. The ML alpha is highly significant against all entrenched factor models, as well as against a 'zoo-factor' that combines the ML portfolio's dominant characteristics using the Stambaugh and Yuan (2017) mispricing factor approach. Although the ML models are trained on the factor zoo, the ML portfolio's dominant characteristics revolve around just 10 features. Furthermore, we uncover a rotation pattern between a subset of features that proxies for arbitrage constraints on investors and another that proxies for financial constraints on firms. Our paper provides insight into how characteristics fundamentally explain stock returns over time.
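As a rough illustration of the portfolio step described in the abstract, the sketch below averages forecasts from several models and compares the realized returns of the predicted winners and losers for a single period. The abstract does not state the combination rule, the cutoff, or the weighting scheme, so the equal-weighted forecast average, the long-short construction, the `quantile` parameter, and the function name `long_short_return` are all assumptions made for this example, not the paper's method.

```python
def long_short_return(forecasts, realized, quantile=0.1):
    """One-period return of winners minus losers (hypothetical helper).

    forecasts: list of lists, one inner list of predicted returns per model.
    realized:  list of realized next-period returns, one per stock.
    quantile:  fraction of stocks placed in each leg (assumed, not from the paper).
    """
    n = len(realized)
    # Combine models with a simple average (assumption: equal model weights).
    combined = [sum(model[i] for model in forecasts) / len(forecasts)
                for i in range(n)]
    # Rank stocks from lowest to highest combined forecast.
    order = sorted(range(n), key=lambda i: combined[i])
    k = max(1, int(n * quantile))          # number of stocks per leg
    losers, winners = order[:k], order[-k:]
    # Equal-weight each leg (assumption) and take the spread.
    long_leg = sum(realized[i] for i in winners) / k
    short_leg = sum(realized[i] for i in losers) / k
    return long_leg - short_leg


# Toy data: two models, ten stocks (made-up numbers for illustration only).
forecasts = [
    [0.01, 0.03, -0.02, 0.00, 0.05, -0.04, 0.02, -0.01, 0.04, -0.03],
    [0.03, 0.01, -0.04, 0.00, 0.07, -0.01, 0.02, -0.01, 0.02, -0.05],
]
realized = [0.02, 0.01, -0.03, 0.00, 0.06, -0.01, 0.01, 0.00, 0.03, -0.05]
spread = long_short_return(forecasts, realized, quantile=0.2)
print(f"long-short spread: {spread:.4f}")
```

In an out-of-sample setting, the forecasts for each period would come from models fitted only on earlier data, and the per-period spreads would then be regressed on benchmark factors to estimate alpha.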
Keywords: Machine Learning, Characteristics, Return Predictability, Portfolio Evaluation
JEL Classification: G12, G32