Factor Profiling for Ultra High Dimensional Variable Selection
44 Pages. Posted: 23 May 2010
Date Written: May 23, 2010
Abstract
We propose a novel method, factor profiling (FP), for ultra high dimensional variable selection. The method assumes that the correlation structure of the high dimensional data can be well represented by a set of low-dimensional latent factors (Fan et al., 2008). The latent factors can then be estimated consistently by eigenvalue-eigenvector decomposition and subsequently profiled out from both the response and the predictors. We refer to this operation as FP. Under the assumed factor structure, FP produces uncorrelated predictors, so sure independence screening (SIS; Fan and Lv, 2008) can be applied directly. This leads to profiled independent screening (PIS). PIS is shown to be selection consistent even if the predictor dimension is substantially larger than the sample size. To further improve PIS, a novel method of profiled sequential screening (PSS) is proposed. PSS shares the strengths of forward regression (Wang, 2009a) but is computationally even simpler. Numerical studies are presented to corroborate our theoretical findings.
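The FP and PIS steps described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes the number of factors `d` is already known, estimates the factor space from the leading left singular vectors of the centered predictor matrix (equivalent to an eigenvalue-eigenvector decomposition of its Gram matrix), and ranks predictors by absolute marginal correlation with the profiled response. The function names `factor_profile` and `profiled_independent_screening`, and the fixed top-`k` cutoff, are hypothetical choices made for illustration only.

```python
import numpy as np


def factor_profile(X, y, d):
    """Remove the top-d estimated latent factors from X (n x p) and y (n,).

    Assumption: d is supplied by the caller; choosing d is not covered here.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Left singular vectors of Xc are eigenvectors of Xc @ Xc.T, so the
    # first d columns of U give an orthonormal basis of the factor space.
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    F = U[:, :d]
    # Project both predictors and response onto the complement of that space.
    X_prof = Xc - F @ (F.T @ Xc)
    y_prof = yc - F @ (F.T @ yc)
    return X_prof, y_prof


def profiled_independent_screening(X, y, d, k):
    """Return indices of the k predictors with the largest absolute
    marginal correlation after factor profiling (hypothetical cutoff rule)."""
    X_prof, y_prof = factor_profile(X, y, d)
    norms = np.linalg.norm(X_prof, axis=0)
    norms[norms == 0] = 1.0  # guard against constant columns
    score = np.abs(X_prof.T @ y_prof) / norms
    return np.argsort(score)[::-1][:k]
```

A fixed cutoff `k` keeps the sketch short; the keywords suggest the paper relies on criteria such as BIC and a maximum eigenvalue ratio for such choices, and those are not reproduced here.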
Keywords: Bayesian Information Criterion, Factor Profiling, Forward Regression, Maximum Eigenvalue Ratio Criterion, Profiled Independent Screening, Profiled Sequential Screening, Selection Consistency, Screening Consistency
JEL Classification: C10, C13