Target PCA: Transfer Learning Large Dimensional Panel Data
48 Pages. Posted: 27 Dec 2022. Last revised: 30 Aug 2023.
Date Written: December 14, 2022
Abstract
This paper develops a novel method to estimate a latent factor model for a large target panel with missing observations by optimally using the information from auxiliary panel data sets. We refer to our estimator as target-PCA. Transfer learning from auxiliary panel data allows us to deal with a large fraction of missing observations and weak signals in the target panel. We show that our estimator is more efficient and can consistently estimate weak factors, which are not identifiable with conventional methods. We provide the asymptotic inferential theory for target-PCA under very general assumptions on the approximate factor model and missing patterns. In an empirical study of imputing data in a mixed-frequency macroeconomic panel, we demonstrate that target-PCA significantly outperforms all benchmark methods.
Keywords: Factor Analysis, Principal Components, Transfer Learning, Multiple Data Sets, Large-Dimensional Panel Data, Large N and T, Missing Data, Weak Factors, Causal Inference
JEL Classification: C14, C38, C55, G12