# Machine Learning Panel Data Regressions with Heavy-tailed Dependent Data: Theory and Application

51 Pages. Posted: 24 Sep 2020. Last revised: 23 Nov 2021

## Andrii Babii

University of North Carolina at Chapel Hill

## Ryan T. Ball

The Stephen M. Ross School of Business at the University of Michigan

## Eric Ghysels

University of North Carolina Kenan-Flagler Business School; University of North Carolina (UNC) at Chapel Hill - Department of Economics

## Jonas Striaukas

Louvain Finance; UC Louvain and F.R.S.-FNRS

Date Written: August 6, 2020

### Abstract

The paper introduces structured machine learning regressions for heavy-tailed dependent panel data potentially sampled at different frequencies. We focus on the sparse-group LASSO regularization. This type of regularization can take advantage of mixed-frequency time series panel data structures and improve the quality of the estimates. We obtain oracle inequalities for the pooled and fixed effects sparse-group LASSO panel data estimators, recognizing that financial and economic data can have fat tails. To that end, we leverage a new Fuk-Nagaev concentration inequality for panel data consisting of heavy-tailed $\tau$-mixing processes.
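
As a rough illustration of the regularization named in the abstract, the sketch below evaluates a sparse-group LASSO penalty in its common textbook form: a convex combination of the LASSO ($\ell_1$) norm and the group LASSO norm (the sum of within-group $\ell_2$ norms). The function name, the unweighted group norm, and the parameters `lam` and `alpha` are assumptions made for this example; the abstract does not specify the exact form or weighting used in the paper.

```python
import math


def sparse_group_lasso_penalty(beta, groups, lam=1.0, alpha=0.5):
    """Evaluate lam * (alpha * ||beta||_1 + (1 - alpha) * sum_g ||beta_g||_2).

    beta   : sequence of coefficients
    groups : sequence of group labels, one per coefficient (hypothetical layout)
    lam    : overall penalty level (assumed name)
    alpha  : weight on the l1 part; alpha=1 is LASSO, alpha=0 is group LASSO
    """
    # LASSO part: sum of absolute values, encourages sparsity within groups.
    l1 = sum(abs(b) for b in beta)

    # Group LASSO part: accumulate squared coefficients per group,
    # then sum the Euclidean norms, encouraging whole groups to be zero.
    squared_by_group = {}
    for b, g in zip(beta, groups):
        squared_by_group[g] = squared_by_group.get(g, 0.0) + b * b
    group_l2 = sum(math.sqrt(s) for s in squared_by_group.values())

    return lam * (alpha * l1 + (1.0 - alpha) * group_l2)


# Two groups of two coefficients each; the second group is entirely zero.
beta = [3.0, 4.0, 0.0, 0.0]
groups = [0, 0, 1, 1]
penalty = sparse_group_lasso_penalty(beta, groups, lam=1.0, alpha=0.5)
```

In a mixed-frequency setting, one natural (assumed) choice is to let each group collect the lags of a single high-frequency covariate, so that the group term can drop an entire covariate while the $\ell_1$ term thins out individual coefficients within the groups that remain.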

Keywords: high-dimensional panels, large N and T panels, mixed-frequency data, sparse-group LASSO, fat tails

JEL Classification: C22, C51, C52, C53, C55, C58, G17

Suggested Citation

Babii, Andrii and Ball, Ryan T. and Ghysels, Eric and Striaukas, Jonas, Machine Learning Panel Data Regressions with Heavy-tailed Dependent Data: Theory and Application (August 6, 2020). Available at SSRN: https://ssrn.com/abstract=3670847 or http://dx.doi.org/10.2139/ssrn.3670847