Machine Learning for Pattern Discovery in Management Research

44 Pages. Posted: 14 Jan 2020. Last revised: 26 Jun 2020

Prithwiraj Choudhury

Harvard University - Business School (HBS)

Ryan Allen

Harvard University - Business School (HBS)

Michael Endres

Harvard University - Institute for Quantitative Social Sciences

Date Written: June 23, 2020

Abstract

Supervised machine learning (ML) methods are a powerful toolkit for discovering robust patterns in quantitative data. The patterns identified by ML can be used for exploratory inductive or abductive research, or for post-hoc analysis of regression results to detect patterns that may have gone unnoticed. However, ML models should not be treated as the result of a deductive causal test. To demonstrate the application of ML for pattern discovery, we implement ML algorithms to study employee turnover at a large technology company. We interpret the relationships between variables using partial dependence plots, which uncover surprising nonlinear and interdependent patterns that traditional methods may have missed. For readers considering ML for pattern discovery, we provide guidance for evaluating model performance, highlight human decisions in the process, and warn of common misinterpretation pitfalls. An online appendix provides code and data to implement the algorithms demonstrated in the paper.
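
As a rough illustration of the partial dependence idea mentioned in the abstract, the sketch below computes a partial dependence curve by hand: one feature is forced to each value on a grid for every row, and the model's predictions are averaged. It is not the paper's code or data. The function `toy_turnover_model`, the feature names (`tenure_years`, `pay_ratio`), and all numbers are hypothetical stand-ins for a fitted classifier and a real employee dataset.

```python
from typing import Callable, List, Sequence

Row = List[float]


def partial_dependence(
    predict: Callable[[Row], float],
    rows: Sequence[Row],
    feature: int,
    grid: Sequence[float],
) -> List[float]:
    """Average prediction over all rows with one feature forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)       # copy so the original data is untouched
            modified[feature] = value  # overwrite only the feature of interest
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_turnover_model(row: Row) -> float:
    """Hypothetical stand-in for a fitted model's predicted turnover probability."""
    tenure_years, pay_ratio = row
    # Nonlinear in tenure: risk is elevated only between 1 and 3 years.
    risk = 0.6 if 1.0 <= tenure_years <= 3.0 else 0.2
    # Interaction: risk is halved when pay is at or above the market ratio.
    if pay_ratio >= 1.0:
        risk *= 0.5
    return risk


# Made-up rows of [tenure_years, pay_ratio]; assumptions for illustration only.
rows = [[0.5, 0.9], [2.0, 1.1], [4.0, 0.8], [2.5, 1.2]]
tenure_grid = [0.5, 2.0, 5.0]

pd_tenure = partial_dependence(toy_turnover_model, rows, feature=0, grid=tenure_grid)
for value, avg in zip(tenure_grid, pd_tenure):
    print(f"tenure={value:>4}: average predicted turnover = {avg:.3f}")
```

Plotting `pd_tenure` against `tenure_grid` would give a partial dependence plot for tenure. In this toy setup the curve rises and then falls, the kind of nonlinear shape a single linear regression coefficient would not show. As the abstract cautions, such a curve describes the model's predictions and should not be read as a causal estimate.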

Keywords: machine learning, supervised machine learning, induction, abduction, exploratory data analysis, pattern discovery, decision trees, random forests, neural networks, ROC curve, confusion matrix, partial dependence plots

Suggested Citation

Choudhury, Prithwiraj and Allen, Ryan and Endres, Michael, Machine Learning for Pattern Discovery in Management Research (June 23, 2020). Harvard Business School Technology & Operations Mgt. Unit Working Paper No. 19-032, Available at SSRN: https://ssrn.com/abstract=3518780 or http://dx.doi.org/10.2139/ssrn.3518780

Prithwiraj Choudhury (Contact Author)

Harvard University - Business School (HBS)

Soldiers Field Road
Morgan 270C
Boston, MA 02163
United States

Ryan Allen

Harvard University - Business School (HBS)

Soldiers Field Road
Morgan 270C
Boston, MA 02163
United States

Michael Endres

Harvard University - Institute for Quantitative Social Sciences

1875 Cambridge Street
Cambridge, MA 02138
United States
