XSTrees: A Tree Sampling Framework for Interpretable Tree Ensembles

Posted: 8 Jan 2020

Georgia Perakis

Massachusetts Institute of Technology (MIT) - Sloan School of Management

Divya Singhvi

Massachusetts Institute of Technology (MIT) - Operations Research Center

Omar Skali-Lami

Massachusetts Institute of Technology (MIT) - Operations Research Center

Date Written: December 17, 2019

Abstract

The objective of this paper is to introduce and study a novel machine learning model for classification and regression that is interpretable, scalable, stable, and competitive in accuracy with other machine learning methods. In particular, we propose an Extended Sampled Trees (XSTrees) algorithm, which linearly extends the splits of individual tree-based methods such as CART. The method then introduces a data-based distribution over this tree space to account for sampling variability. We prove that the method is consistent. Furthermore, we benchmark this new method against state-of-the-art methods (for example, XGBoost and other tree ensemble models) on synthetic data, publicly available datasets, and real-world data. Accurate classification and prediction are essential for various data-driven operations management problems. Hence, our proposed method has the potential to influence a wide range of decision-making problems related to pricing, logistics, and others.

Keywords: Machine Learning, Ensemble Tree Methods, Random Forests, Predictive Analytics, Regression, Classification, Interpretability

Suggested Citation

Perakis, Georgia and Singhvi, Divya and Skali-Lami, Omar, XSTrees: A Tree Sampling Framework for Interpretable Tree Ensembles (December 17, 2019). Available at SSRN: https://ssrn.com/abstract=3505431

Georgia Perakis

Massachusetts Institute of Technology (MIT) - Sloan School of Management

100 Main Street
E62-565
Cambridge, MA 02142
United States

Divya Singhvi

Massachusetts Institute of Technology (MIT) - Operations Research Center

77 Massachusetts Avenue
Bldg. E 40-149
Cambridge, MA 02139
United States

Omar Skali-Lami (Contact Author)

Massachusetts Institute of Technology (MIT) - Operations Research Center

77 Massachusetts Avenue
Bldg. E 40-149
Cambridge, MA 02139
United States
