Missing Data in Asset Pricing Panels
64 Pages. Posted: 18 Nov 2021
Date Written: September 28, 2021
Missing data for return predictors is a common problem in cross-sectional asset pricing studies. Most papers do not explicitly discuss how they treat missing data, but conventional treatments either restrict the sample to complete cases for all predictors or impute the unconditional mean for the missing predictor. Both methods have undesirable properties: they are either inefficient or lead to biased estimators and incorrect inference. We propose a simple and computationally attractive alternative approach using conditional mean imputations and weighted least squares. This method allows us to use all sample points with observed returns, it results in valid inference, and it can be applied in non-linear and high-dimensional settings. We map our estimator into a GMM framework to study its relative efficiency and find that it performs almost as well as the efficient but computationally costly GMM estimator in many cases. We apply our procedure to a large panel of return predictors and find that it leads to improved out-of-sample predictability.
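As a rough illustration of the idea of conditional mean imputation followed by weighted least squares, the sketch below works through a toy case with two predictors. It is not the paper's estimator: the function name `conditional_mean_wls`, the linear imputation model, the particular weights (inverse of an estimated error variance that is inflated for imputed rows), and the simulated data with values missing completely at random are all assumptions made for this example.

```python
import numpy as np

def conditional_mean_wls(y, x1, x2):
    """Hypothetical sketch: impute missing x2 by E[x2 | x1], then run WLS.

    x1 is always observed; x2 contains NaN where it is missing.
    """
    miss = np.isnan(x2)
    obs = ~miss
    ones = np.ones_like(x1)

    # Step 1: fit a linear conditional mean of x2 given x1 on complete cases.
    Z_obs = np.column_stack([ones[obs], x1[obs]])
    gamma, *_ = np.linalg.lstsq(Z_obs, x2[obs], rcond=None)
    u_var = np.var(x2[obs] - Z_obs @ gamma, ddof=2)  # imputation error variance

    # Fill missing x2 with the fitted conditional mean; keep observed values.
    x2_filled = np.where(miss, gamma[0] + gamma[1] * x1, x2)
    X = np.column_stack([ones, x1, x2_filled])

    # Step 2: complete-case OLS gives a preliminary slope and error variance.
    b_cc, *_ = np.linalg.lstsq(X[obs], y[obs], rcond=None)
    e_var = np.var(y[obs] - X[obs] @ b_cc, ddof=3)

    # Step 3 (assumed weighting): imputed rows carry extra noise of size
    # b2^2 * Var(imputation error), so they receive a smaller weight.
    row_var = np.where(miss, e_var + b_cc[2] ** 2 * u_var, e_var)
    sw = np.sqrt(1.0 / row_var)

    # Step 4: WLS on all rows, computed as OLS on square-root-weighted data.
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Simulated example (assumed data-generating process, not from the paper).
rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(size=n)
x2_full = 0.5 * x1 + rng.normal(size=n)
y = 0.1 + 0.2 * x1 + 0.3 * x2_full + rng.normal(size=n)

# Remove about 40% of x2 completely at random.
x2 = x2_full.copy()
x2[rng.random(n) < 0.4] = np.nan

beta_hat = conditional_mean_wls(y, x1, x2)
print(beta_hat)  # intercept, slope on x1, slope on x2
```

Unlike a complete-case regression, every row with an observed return enters the final fit; the weights only reduce the influence of rows whose `x2` was imputed.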
Keywords: Cross Section of Returns, Missing Data, Expected Returns, Generalized Method of Moments
JEL Classification: C14, C58, G12