Training Trees on Tails with Applications to Portfolio Choice

34 Pages. Posted: 20 Jun 2019. Last revised: 24 Feb 2020


Guillaume Coqueret

EMLYON Business School

Tony Guida

Université de Savoie - Finance and Banking; RAM Active Investments

Date Written: June 12, 2019


In this article, we investigate the impact of truncating training data when fitting regression trees. We argue that training times can be curtailed by reducing the training sample without any loss in out-of-sample accuracy, as long as the prediction model is trained on the tails of the dependent variable, that is, when ‘average’ observations are discarded from the training sample. Filtering instances affects which features are selected for the splits and can help reduce overfitting by favoring predictors that have a monotonic impact on the dependent variable. We test this technique in an out-of-sample portfolio selection exercise, which shows its benefits. Our results have important implications for time-consuming tasks such as hyperparameter tuning and validation.
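The filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `tail_filter`, the tail fraction `q`, and the synthetic data are all assumptions made for illustration. The sketch keeps only the observations whose dependent variable falls in the lowest or highest `q` fraction of the sample and discards the ‘average’ ones in between.

```python
import random

def tail_filter(features, labels, q=0.2):
    """Keep only observations whose label lies in the bottom or top q fraction.

    q is a hypothetical tail fraction (0 < q <= 0.5), not a value from the paper.
    """
    if not 0 < q <= 0.5:
        raise ValueError("q must be in (0, 0.5]")
    n = len(labels)
    k = max(1, int(n * q))  # number of observations kept in each tail
    # Indices of the observations, sorted by the value of the dependent variable.
    order = sorted(range(n), key=lambda i: labels[i])
    # Union of the lower and upper tails, restored to the original sample order.
    keep = sorted(set(order[:k]) | set(order[-k:]))
    return [features[i] for i in keep], [labels[i] for i in keep]

# Synthetic data (an assumption for illustration): one noisy predictor.
random.seed(0)
X = [[random.gauss(0, 1)] for _ in range(1000)]
y = [0.5 * x[0] + random.gauss(0, 1) for x in X]

X_tail, y_tail = tail_filter(X, y, q=0.2)
print(len(X_tail), "of", len(X), "observations kept")
```

The reduced sample `(X_tail, y_tail)` would then be passed to any regression tree learner in place of the full sample; the smaller training set is where the savings in training time would come from.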

Keywords: Decision trees; Filtering training set; Factor investing; Portfolio choice; Feature selection

JEL Classification: C40; G11; G12

Suggested Citation

Coqueret, Guillaume and Guida, Tony, Training Trees on Tails with Applications to Portfolio Choice (June 12, 2019). Available at SSRN.

Guillaume Coqueret (Contact Author)

EMLYON Business School

23 Avenue Guy de Collongue
Ecully, 69132

Tony Guida

Université de Savoie - Finance and Banking

27 Rue Marcoz
Chambéry, 73011

RAM Active Investments

8 Rue du Rhône
Geneva, 1204

