XSTrees: A Tree Sampling Framework for Interpretable Tree Ensembles
Posted: 8 Jan 2020
Date Written: December 17, 2019
The objective of this paper is to introduce and study a novel machine learning model for classification and regression that is interpretable, scalable, stable, and competitive in accuracy with other machine learning methods. In particular, we propose an Extended Sampled Trees (XSTrees) algorithm, which linearly extends the splits of individual tree-based methods such as CART. The method then introduces a data-based distribution over this tree space to account for sampling variability. We prove that the method is consistent. Furthermore, we benchmark this new method against state-of-the-art methods (for example, XGBoost and other tree ensemble models) on synthetic data, publicly available datasets, and real-world data. Accurate classification and prediction are essential for various data-driven operations management problems. Hence, our proposed method has the potential to influence a wide range of decision-making problems related to pricing, logistics, and other areas.
Keywords: Machine Learning, Ensemble Tree Methods, Random Forests, Predictive Analytics, Regression, Classification, Interpretability