Targeting predictors in random forest regression

44 pages. Posted: 28 Apr 2020. Last revised: 29 Oct 2020.

Daniel Borup

Aarhus University, CREATES, DFI

Bent Jesper Christensen

Aarhus University

Nicolaj Mühlbach

Massachusetts Institute of Technology

Mikkel Slot Nielsen

Columbia University

Date Written: April 3, 2020

Abstract

Random forest regression (RF) is an extremely popular tool for the analysis of high-dimensional data. Nonetheless, its benefits may be lessened in sparse settings, due to weak predictors, and a pre-estimation dimension reduction (targeting) step is required. We show that proper targeting controls the probability of placing splits along strong predictors, thus providing an important complement to RF's feature sampling. This is supported by simulations using representative finite samples. Moreover, we quantify the immediate gain from targeting in terms of increased strength of individual trees. Macroeconomic and financial applications show that the bias-variance tradeoff implied by targeting, due to increased correlation among trees in the forest, is balanced at a medium degree of targeting, selecting the best 10-30% of commonly applied predictors. Improvements in predictive accuracy of targeted RF relative to ordinary RF are considerable, up to 12-13%, occurring both in recessions and expansions, particularly at long horizons.
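The interaction between targeting and RF's feature sampling described in the abstract can be illustrated with a small counting argument. The sketch below is not the paper's result. It only computes the probability that at least one strong predictor appears among the candidates drawn at a single split, assuming candidates are drawn uniformly without replacement. The names p (number of predictors), s (number of strong predictors), and mtry (candidates per split) are illustrative, as is the assumption that targeting keeps all strong predictors while mtry stays fixed. Being among the candidates does not guarantee that the split is placed on a strong predictor, so this is only a rough proxy for the quantity the paper studies.

```python
from math import comb


def prob_strong_candidate(p: int, s: int, mtry: int) -> float:
    """Probability that at least one of the s strong predictors is among
    the mtry candidates drawn uniformly without replacement from p predictors.

    This is one minus the hypergeometric probability of drawing only
    weak predictors.
    """
    if not (0 <= s <= p and 1 <= mtry <= p):
        raise ValueError("require 0 <= s <= p and 1 <= mtry <= p")
    # comb(p - s, mtry) is 0 when mtry > p - s, so the result is then 1.0.
    return 1.0 - comb(p - s, mtry) / comb(p, mtry)


# Hypothetical sparse setting: 100 predictors, 5 of them strong, 5 candidates per split.
untargeted = prob_strong_candidate(p=100, s=5, mtry=5)

# Hypothetical targeting step that keeps the best 20% of predictors and,
# by assumption, retains all 5 strong ones.
targeted = prob_strong_candidate(p=20, s=5, mtry=5)

print(f"untargeted: {untargeted:.3f}, targeted: {targeted:.3f}")
```

If mtry is instead scaled with the number of retained predictors (for example, a fixed fraction of p), this particular probability changes much less, which is one reason the sketch should be read as an illustration rather than as a summary of the paper's analysis.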

Keywords: Random Forests, LASSO, High-Dimensional Forecasting, Weak Predictors, Targeted Predictors

JEL Classification: C53, C55, E17, G12

Suggested Citation

Borup, Daniel and Christensen, Bent Jesper and Mühlbach, Nicolaj and Nielsen, Mikkel Slot, Targeting predictors in random forest regression (April 3, 2020). Available at SSRN: https://ssrn.com/abstract=3551557 or http://dx.doi.org/10.2139/ssrn.3551557

Daniel Borup (Contact Author)

Aarhus University, CREATES, DFI

School of Business and Social Sciences
Fuglesangs Alle 4
Aarhus V, 8210
Denmark

Bent Jesper Christensen

Aarhus University

Fuglesangs Alle 4
DK-8210 Aarhus V
Denmark

Nicolaj Mühlbach

Massachusetts Institute of Technology

The Morris and Sophie Chang Building
50 Memorial Drive, Bldg E52-300
Cambridge, MA 02142
United States
8572226395 (Phone)

Mikkel Slot Nielsen

Columbia University

1255 Amsterdam Avenue
New York, NY 10027
United States
