The Use of Binary Choice Forests to Model and Estimate Discrete Choices

85 Pages. Posted: 2 Aug 2019. Last revised: 3 Apr 2021

Ningyuan Chen

University of Toronto at Mississauga - Department of Management; University of Toronto - Rotman School of Management

Guillermo Gallego

HKUST

Zhuodong Tang

Department of Industrial Engineering & Decision Analytics

Date Written: August 2, 2019

Abstract

We show the equivalence of discrete choice models and a forest of binary decision trees. This suggests that standard machine learning techniques based on random forests can serve to estimate discrete choice models with an interpretable output: the underlying trees can be viewed as the internal choice process of customers. Our data-driven theoretical results show that random forests can predict the choice probability of any discrete choice model consistently. Moreover, our algorithm predicts unseen assortments with mechanisms and errors that can be theoretically analyzed. We also prove that the splitting criterion in random forests, the Gini index, is capable of recovering preference rankings of customers. The framework has unique practical advantages: it can capture behavioral patterns such as irrationality or sequential searches; it handles nonstandard formats of training data that result from aggregation; it can measure product importance based on how frequently a random customer would make decisions depending on the presence of the product; it can also incorporate price information and customer features. Our numerical results show that using random forests to estimate customer choices can outperform the best parametric models in synthetic and real datasets when presented with enough data or when the underlying discrete choice model cannot be correctly specified by existing parametric models.
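As an illustration of the equivalence claimed above, the following minimal sketch encodes each customer type as a binary tree whose internal nodes ask whether a given product is in the offered assortment, and reads choice probabilities off a forest of such trees. It is not the authors' algorithm: the rank-based customer types, the tuple encoding of trees, the `NO_PURCHASE` label, and all function names are assumptions made for this example.

```python
from collections import Counter

NO_PURCHASE = 0  # hypothetical label for the outside (no-purchase) option


def ranking_to_tree(ranking):
    """Turn a preference ranking into a chain-shaped binary choice tree.

    An internal node is a tuple (product, yes_branch, no_branch); a leaf is
    the label of the chosen product. The customer picks the first ranked
    product that is present in the assortment.
    """
    tree = NO_PURCHASE
    for product in reversed(ranking):
        tree = (product, product, tree)
    return tree


def tree_choice(tree, assortment):
    """Walk one tree: at each node, branch on whether the product is offered."""
    while isinstance(tree, tuple):
        product, yes_branch, no_branch = tree
        tree = yes_branch if product in assortment else no_branch
    return tree


def forest_choice_probabilities(forest, assortment):
    """Choice probability = fraction of trees (customer types) choosing each item."""
    counts = Counter(tree_choice(tree, assortment) for tree in forest)
    return {item: n / len(forest) for item, n in counts.items()}


# Four equally weighted, made-up customer types over products 1, 2, 3.
forest = [ranking_to_tree(r) for r in ([1, 2, 3], [2, 1, 3], [3, 1, 2], [1, 3, 2])]
probs = forest_choice_probabilities(forest, {2, 3})
print(probs)
```

With the assortment {2, 3}, two of the four hypothetical customer types choose product 2 and two choose product 3, so each has estimated choice probability 0.5. In the paper's setting the trees would be fitted from transaction data by a random forest rather than written down by hand.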

Keywords: machine learning, online retailing, discrete choice model, data driven, random forest

Suggested Citation

Chen, Ningyuan and Gallego, Guillermo and Tang, Zhuodong, The Use of Binary Choice Forests to Model and Estimate Discrete Choices (August 2, 2019). Available at SSRN: https://ssrn.com/abstract=3430886 or http://dx.doi.org/10.2139/ssrn.3430886

Ningyuan Chen

University of Toronto at Mississauga - Department of Management

Canada

University of Toronto - Rotman School of Management

105 St. George St
Toronto, ON M5S 3E6
Canada

Guillermo Gallego

HKUST

Clear Water Bay
Kowloon, 999999
Hong Kong

HOME PAGE: https://seng.ust.hk/about/people/faculty/guillermo-gallego

Zhuodong Tang (Contact Author)

Department of Industrial Engineering & Decision Analytics

Room 6542, Academic Building
Clear Water Bay
Kowloon
Hong Kong
