Forecasting with Partial Least Squares When a Large Number of Predictors Are Available
95 Pages. Posted: 19 Oct 2022
Date Written: February 12, 2022
We consider Partial Least Squares (PLS) estimation of a time-series forecasting model when the data contain a large number (T) of time-series observations on each of a large number (N) of predictor variables. In the model, a subset, or the whole set, of the latent common factors in the predictors determines a single target variable to be forecast. The factors relevant for forecasting the target variable, which we refer to as PLS factors, can be generated sequentially by the "Nonlinear Iterative Partial Least Squares" (NIPLS) algorithm. Our asymptotic analysis yields two main findings. First, the optimal number of PLS factors for forecasting can be much smaller than the number of common factors in the original predictor variables that are relevant for the target variable. Second, when more than the optimal number of PLS factors is used, the out-of-sample forecasting power of the factors can decrease even while their in-sample explanatory power increases. Our Monte Carlo simulation results confirm these asymptotic results. In addition, the simulations indicate that, unless very large samples are used, the out-of-sample forecasting power of the PLS factors is often higher when fewer factors than the asymptotically optimal number are used. We find that the out-of-sample forecasting power of the PLS factors often decreases as the second, third, and further factors are added, even if the asymptotically optimal number of factors is greater than one.
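To make the sequential generation of PLS factors concrete, the following is a minimal sketch of the textbook single-target (PLS1) form of the iterative algorithm. It is an illustration under stated assumptions, not necessarily the exact procedure analyzed in the paper: the function names `pls_factors` and `in_sample_r2` are hypothetical, the predictors and target are simply demeaned, and no guard is included for a zero weight vector.

```python
import numpy as np

def pls_factors(X, y, k):
    """Sequentially extract k PLS factors (textbook PLS1 variant, an assumption).

    X : (T, N) array of predictors, y : (T,) target, k : number of factors.
    Returns a (T, k) array whose columns are the factors.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xr = X - X.mean(axis=0)          # demeaned predictors (deflated in the loop)
    yr = y - y.mean()                # demeaned target (deflated in the loop)
    factors = np.empty((X.shape[0], k))
    for j in range(k):
        w = Xr.T @ yr                # weights: covariance of each predictor with the target
        w /= np.linalg.norm(w)       # normalize the weight vector
        f = Xr @ w                   # j-th PLS factor
        ff = f @ f
        Xr = Xr - np.outer(f, f @ Xr) / ff   # remove the factor from the predictors
        yr = yr - f * (f @ yr) / ff          # remove the factor from the target
        factors[:, j] = f
    return factors

def in_sample_r2(F, y):
    """In-sample R^2 from regressing the demeaned target on the factors in F."""
    yc = y - y.mean()
    beta, *_ = np.linalg.lstsq(F, yc, rcond=None)
    resid = yc - F @ beta
    return 1.0 - (resid @ resid) / (yc @ yc)
```

Because each new factor is built from predictors that have already been purged of the earlier factors, the factors are mutually orthogonal, and the in-sample R^2 cannot fall as factors are added. That mechanical in-sample gain says nothing about out-of-sample accuracy, which is why the number of factors would in practice be chosen on held-out data rather than by in-sample fit.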
Keywords: Partial Least Squares, Factors, Forecasting
JEL Classification: C51, C53, C55