Factor Profiling for Ultra High Dimensional Variable Selection

44 Pages Posted: 23 May 2010

Hansheng Wang

Peking University - Guanghua School of Management

Date Written: May 23, 2010


We propose a novel method, factor profiling (FP), for ultra high dimensional variable selection. The method assumes that the correlation structure of the high dimensional data can be well represented by a set of low-dimensional latent factors (Fan et al., 2008). These latent factors can be estimated consistently by eigenvalue-eigenvector decomposition and are then profiled out from both the response and the predictors; we refer to this operation as FP. Under the assumed factor structure, FP produces uncorrelated predictors, so that sure independence screening (SIS; Fan and Lv, 2008) can be applied directly. The resulting procedure is called profiled independent screening (PIS). PIS is shown to be selection consistent even when the predictor dimension is substantially larger than the sample size. To further improve on PIS, we propose profiled sequential screening (PSS). PSS shares the strengths of forward regression (Wang, 2009a) but is computationally even simpler. Numerical studies are presented to corroborate our theoretical findings.
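
The profiling and screening steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names `factor_profile` and `profiled_independent_screening`, the simulated data, the choice to treat the number of factors `d` as known, and the fixed top-`k` cutoff are all assumptions made for illustration. The sketch estimates the factors from the leading eigenvectors of the centered Gram matrix, projects them out of the response and the predictors, and then ranks the profiled predictors by a marginal association score.

```python
import numpy as np

def factor_profile(X, y, d):
    """Profile d estimated latent factors out of X (n x p) and y (n,).

    Returns the profiled predictors, the profiled response, and the
    estimated factor basis Z (n x d). Illustrative helper, not from the paper.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)          # center each predictor
    yc = y - y.mean()                # center the response
    # Eigen-decomposition of the n x n Gram matrix; eigh sorts ascending,
    # so the last d eigenvectors span the estimated factor space.
    _, evecs = np.linalg.eigh(Xc @ Xc.T)
    Z = evecs[:, n - d:]
    Q = np.eye(n) - Z @ Z.T          # projector onto the complement of Z
    return Q @ Xc, Q @ yc, Z

def profiled_independent_screening(X, y, d, k):
    """Return indices of the k predictors with the largest marginal score
    after profiling (assumed top-k rule; the cutoff choice is hypothetical)."""
    Xp, yp, _ = factor_profile(X, y, d)
    norms = np.linalg.norm(Xp, axis=0)
    norms[norms == 0] = 1.0          # guard against all-zero columns
    score = np.abs(Xp.T @ yp) / norms
    return np.argsort(score)[::-1][:k]

# Simulated example (all settings are assumptions): p > n, two latent
# factors, and three truly relevant predictors at indices 0, 1, 2.
rng = np.random.default_rng(0)
n, p, d = 200, 500, 2
F = rng.standard_normal((n, d))      # latent factors
B = rng.standard_normal((p, d))      # factor loadings
U = rng.standard_normal((n, p))      # idiosyncratic components
X = F @ B.T + U
beta = np.zeros(p)
beta[:3] = 5.0
y = X @ beta + rng.standard_normal(n)

selected = profiled_independent_screening(X, y, d=d, k=10)
print(sorted(selected.tolist()))
```

Working with the n x n Gram matrix rather than the p x p covariance matrix keeps the decomposition cheap when the predictor dimension far exceeds the sample size. In practice the number of factors and the size of the selected model would need to be chosen from the data; the keywords below suggest the paper relies on a maximum eigenvalue ratio criterion and BIC for such choices, neither of which is implemented here.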

Keywords: Bayesian Information Criterion, Factor Profiling, Forward Regression, Maximum Eigenvalue Ratio Criterion, Profiled Independent Screening, Profiled Sequential Screening, Selection Consistency, Screening Consistency

JEL Classification: C10, C13

Suggested Citation

Wang, Hansheng, Factor Profiling for Ultra High Dimensional Variable Selection (May 23, 2010). Available at SSRN: https://ssrn.com/abstract=1613452 or http://dx.doi.org/10.2139/ssrn.1613452

Hansheng Wang (Contact Author)

Peking University - Guanghua School of Management

Peking University
Beijing, Beijing 100871

HOME PAGE: http://hansheng.gsm.pku.edu.cn
