Autoregressive Random Forests: Machine Learning and Lag Selection for Financial Research
Computational Economics (https://doi.org/10.1007/s10614-023-10429-9)
42 Pages. Posted: 1 Jun 2022. Last revised: 17 Aug 2023
Date Written: May 24, 2022
Abstract
This paper demonstrates the use of Random Regression Forests (RRF) for optimal lag selection. Using an extended sample of 144 data series of various types, frequencies, and sample sizes, we perform optimal lag selection with RRF and compare the results with seven “traditional” information criteria and three other machine learning approaches. We show that the information criteria disagree on the optimal lag. To quantify performance, we compare the forecast errors of autoregressive models fitted with the lags selected by each approach and demonstrate that RRF outperforms the other approaches. We offer suggestions to researchers on which approach to use under different combinations of data type and data frequency, and of data type and sample size.
Keywords: random regression forest, optimal lag, Lasso, Ridge regression, Bayesian model averaging
JEL Classification: C53, C13, C63, E52, E61, G17
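The evaluation described in the abstract (choose a lag order, fit an autoregressive model with it, then score the out-of-sample forecast error) can be illustrated with a minimal sketch. This is not the paper's RRF procedure: it uses BIC as a stand-in for a "traditional" information criterion, which may or may not be among the seven the paper evaluates. The simulated AR(2) series and the helper names `lag_matrix`, `fit_ar`, `select_lag_bic`, and `one_step_rmse` are all assumptions made for illustration.

```python
import numpy as np


def lag_matrix(y, p, max_lag):
    """Build the regressors [1, y[t-1], ..., y[t-p]] for t = max_lag..n-1.

    Using the same starting index (max_lag) for every p keeps the
    estimation sample identical across candidate lag orders.
    """
    n = len(y)
    cols = [np.ones(n - max_lag)]
    cols += [y[max_lag - k : n - k] for k in range(1, p + 1)]
    return np.column_stack(cols), y[max_lag:]


def fit_ar(y, p, max_lag):
    """Fit an AR(p) model with intercept by ordinary least squares."""
    X, target = lag_matrix(y, p, max_lag)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta, target - X @ beta


def select_lag_bic(y, max_lag):
    """Return the lag order in 1..max_lag with the smallest BIC."""
    best_p, best_bic = 1, np.inf
    for p in range(1, max_lag + 1):
        _, resid = fit_ar(y, p, max_lag)
        m = len(resid)
        sigma2 = resid @ resid / m
        bic = m * np.log(sigma2) + (p + 1) * np.log(m)
        if bic < best_bic:
            best_p, best_bic = p, bic
    return best_p


def one_step_rmse(train, test, p):
    """RMSE of one-step-ahead forecasts over the test window.

    Coefficients are estimated once on the training window and held fixed.
    """
    beta, _ = fit_ar(train, p, p)
    history = np.concatenate([train, test])
    errors = []
    for t in range(len(train), len(history)):
        # Most recent observation first: y[t-1], ..., y[t-p].
        x = np.concatenate([[1.0], history[t - p : t][::-1]])
        errors.append(history[t] - x @ beta)
    return float(np.sqrt(np.mean(np.square(errors))))


# Simulated AR(2) series (illustrative data, not from the paper).
rng = np.random.default_rng(0)
n = 600
shocks = rng.standard_normal(n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + shocks[t]

train, test = y[:500], y[500:]
p_bic = select_lag_bic(train, max_lag=8)
rmse = one_step_rmse(train, test, p_bic)
print(f"BIC-selected lag: {p_bic}, out-of-sample RMSE: {rmse:.3f}")
```

Under this setup, a competing selector (for example, one driven by a random forest's variable importances) could be swapped in for `select_lag_bic` and compared on the same `one_step_rmse` score, which mirrors the kind of comparison the abstract describes.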