Large Dimensional Latent Factor Modeling with Missing Observations and Applications to Causal Inference
73 Pages. Posted: 28 Oct 2019. Last revised: 21 Nov 2019
Date Written: November 14, 2019
This paper develops the inferential theory for latent factor models estimated from large dimensional panel data with missing observations. We estimate a latent factor model by applying principal component analysis to an adjusted covariance matrix estimated from partially observed panel data. We derive the asymptotic distribution for the estimated factors, loadings and the imputed values under a general approximate factor model. The key application is to estimate counterfactual outcomes in causal inference from panel data. The unobserved control group is modeled as missing values, which are inferred from the latent factor model. The inferential theory for the imputed values allows us to test for individual treatment effects at any time. We apply our method to portfolio investment strategies and find that around 14% of their average returns are significantly reduced by the academic publication of these strategies.
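The estimation idea described in the abstract (principal components applied to a covariance matrix built from partially observed data, followed by imputation of the missing entries) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's exact estimator: the function name `pca_impute`, the encoding of missing entries as NaN, the choice of "adjustment" (averaging each pairwise product only over the periods in which both units are observed), and the recovery of factors by least squares on the observed entries of each period. The inferential theory (asymptotic distributions, tests for treatment effects) is not covered by this sketch.

```python
import numpy as np

def pca_impute(X, k):
    """Illustrative sketch: estimate k latent factors from a T x N panel X
    whose missing entries are NaN, and return the fitted common component.
    Hypothetical helper, not the paper's reference implementation."""
    X = np.asarray(X, dtype=float)
    T, N = X.shape
    W = ~np.isnan(X)                      # observation mask, T x N
    X0 = np.where(W, X, 0.0)              # zero-fill so products skip missing entries

    # Number of periods in which units i and j are both observed.
    counts = W.T.astype(float) @ W.astype(float)
    if np.any(counts == 0):
        raise ValueError("some pair of units is never observed together")

    # Assumed adjustment: average each product over jointly observed periods only.
    cov = (X0.T @ X0) / counts            # N x N

    # Loadings: sqrt(N) times the top-k eigenvectors of cov / N.
    eigvals, eigvecs = np.linalg.eigh(cov / N)
    top = np.argsort(eigvals)[::-1][:k]
    loadings = np.sqrt(N) * eigvecs[:, top]        # N x k

    # Factors: per-period least squares of observed entries on their loadings.
    factors = np.empty((T, k))
    for t in range(T):
        obs = W[t]
        factors[t] = np.linalg.lstsq(loadings[obs], X[t, obs], rcond=None)[0]

    common = factors @ loadings.T         # fitted values, including missing cells
    return factors, loadings, common
```

In the causal-inference reading of the abstract, the treated unit-periods would be set to NaN, and the corresponding entries of `common` would serve as estimates of the counterfactual untreated outcomes; the difference between the observed treated outcome and that entry is then a point estimate of the individual treatment effect.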
Keywords: Factor Analysis, Principal Components, Synthetic Control, Causal Inference, Treatment Effect, Missing Entry, Large-Dimensional Panel Data, Large N and T, Matrix Completion
JEL Classification: C14, C38, C55, G12