Empirical Asset Pricing via Machine Learning
69 Pages; Posted: 9 Nov 2018; Last revised: 13 Sep 2019
Date Written: July 21, 2018
We synthesize the field of machine learning with the canonical problem of empirical asset pricing: measuring asset risk premia. In the familiar empirical setting of cross-section and time-series stock return prediction, we perform a comparative analysis of methods in the machine learning repertoire, including generalized linear models, dimension reduction, boosted regression trees, random forests, and neural networks. At the broadest level, we find that machine learning offers an improved description of expected return behavior relative to traditional forecasting methods. Our implementation establishes a new standard for accuracy in measuring risk premia, summarized by an unprecedented out-of-sample return prediction R². We identify the best-performing methods (trees and neural nets) and trace their predictive gains to their allowance of nonlinear predictor interactions that are missed by other methods. Lastly, we find that all methods agree on the same small set of dominant predictive signals, which includes variations on momentum, liquidity, and volatility. Improved risk premia measurement through machine learning can simplify the investigation into economic mechanisms of asset pricing and justifies its growing role in innovative financial technologies.
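The abstract summarizes accuracy with an out-of-sample return prediction R² but does not define it here. As a minimal sketch, assuming one common convention in return forecasting (benchmarking against a forecast of zero, so the denominator is not demeaned), the statistic could be computed as follows. The function name `r2_oos` and the sample numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def r2_oos(realized, predicted):
    """Out-of-sample R^2 against a zero-forecast benchmark (assumed convention).

    realized  : realized excess returns in the test sample
    predicted : model forecasts for the same observations
    """
    realized = np.asarray(realized, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # Sum of squared forecast errors over the test sample.
    sse = np.sum((realized - predicted) ** 2)
    # Sum of squared realized returns: no demeaning, so the implicit
    # benchmark is a forecast of zero (an assumption of this sketch).
    sst = np.sum(realized ** 2)
    return 1.0 - sse / sst


# Illustrative, made-up monthly excess returns and forecasts.
realized = [0.02, -0.01, 0.03, -0.02]
forecast = [0.01, 0.00, 0.01, -0.01]
score = r2_oos(realized, forecast)
```

Under this convention a forecast of zero everywhere scores exactly 0, a perfect forecast scores 1, and a forecast that does worse than zero scores below 0.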
Keywords: Machine Learning, Big Data, Return Prediction, Cross-Section of Returns, Ridge Regression, (Group) Lasso, Elastic Net, Random Forest, Gradient Boosting, (Deep) Neural Networks, Fintech