Machine Learning Panel Data Regressions with an Application to Nowcasting Price Earnings Ratios
42 Pages. Posted: 24 Sep 2020. Last revised: 5 Oct 2020
Date Written: August 6, 2020
This paper introduces structured machine learning regressions for prediction and nowcasting with panel data consisting of series sampled at different frequencies. Motivated by the empirical problem of predicting corporate earnings for a large cross-section of firms with macroeconomic, financial, and news time series sampled at different frequencies, we focus on sparse-group LASSO regularization. This type of regularization can exploit the mixed-frequency time series panel data structure, and we find that it empirically outperforms unstructured machine learning methods. We obtain oracle inequalities for the pooled and fixed effects sparse-group LASSO panel data estimators, recognizing that financial and economic data exhibit heavier-than-Gaussian tails. To that end, we leverage a novel Fuk-Nagaev concentration inequality for panel data consisting of heavy-tailed $\tau$-mixing processes, which may be of independent interest in other high-dimensional panel data settings.
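As a rough illustration of the regularizer named above, the following sketch computes a sparse-group LASSO penalty in its commonly used form, $\alpha\|\beta\|_1 + (1-\alpha)\sum_g \|\beta_g\|_2$, together with its proximal operator for non-overlapping groups. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names `sg_lasso_penalty` and `sg_lasso_prox`, the weight `alpha`, the penalty level `lam`, and the group layout are all hypothetical and chosen only for illustration.

```python
import numpy as np


def sg_lasso_penalty(beta, groups, alpha):
    """Sparse-group LASSO penalty: alpha * ||beta||_1 + (1 - alpha) * sum_g ||beta_g||_2.

    `groups` is a list of integer index arrays that partition the coefficients
    (hypothetical layout; non-overlapping groups are assumed).
    """
    l1_part = np.abs(beta).sum()
    group_part = sum(np.linalg.norm(beta[idx]) for idx in groups)
    return alpha * l1_part + (1.0 - alpha) * group_part


def sg_lasso_prox(v, groups, lam, alpha):
    """Proximal operator of lam * penalty for non-overlapping groups.

    Step 1: elementwise soft-thresholding at level lam * alpha (the L1 part).
    Step 2: groupwise shrinkage at level lam * (1 - alpha) (the group part).
    """
    u = np.sign(v) * np.maximum(np.abs(v) - lam * alpha, 0.0)
    out = np.zeros_like(u)
    for idx in groups:
        norm = np.linalg.norm(u[idx])
        if norm > 0.0:
            out[idx] = max(0.0, 1.0 - lam * (1.0 - alpha) / norm) * u[idx]
    return out


# Illustrative inputs only: two groups of two coefficients each.
groups = [np.array([0, 1]), np.array([2, 3])]
beta = np.array([3.0, 4.0, 0.0, 0.0])
penalty_mixed = sg_lasso_penalty(beta, groups, alpha=0.5)
shrunk = sg_lasso_prox(np.array([3.0, 4.0, 0.1, -0.1]), groups, lam=1.0, alpha=0.5)
```

Setting `alpha = 1` reduces the penalty to the plain LASSO and `alpha = 0` to the group LASSO, so the weight trades off sparsity within groups against sparsity across groups. In the toy call above, the second group is small enough to be zeroed out entirely while the first is only shrunk.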
Keywords: corporate earnings, nowcasting, high-dimensional panels, mixed frequency data, text data, sparse-group LASSO, heavy-tailed $\tau$-mixing processes, Fuk-Nagaev inequality
JEL Classification: C22, C51, C52, C53, C55, C58, G17