Boosted Generalized Normal Distributions: Integrating Machine Learning with Operations Knowledge

28 Pages. Posted: 1 Aug 2024

Ragıp Gürlek

Emory University - Goizueta Business School

Francis de Véricourt

ESMT European School of Management and Technology

Donald K.K. Lee

Emory University - Goizueta Business School; Emory University - Dept of Biostatistics & Bioinformatics

Date Written: July 26, 2024

Abstract

Applications of machine learning (ML) techniques to operational settings often face two challenges: (i) ML methods mostly provide point predictions, whereas many operational problems require distributional information; and (ii) they typically do not incorporate the extensive body of knowledge in the operations literature, particularly the theoretical and empirical findings that characterize specific distributions. We introduce a novel and rigorous methodology, the Boosted Generalized Normal Distribution (bGND), to address these challenges. The Generalized Normal Distribution (GND) encompasses a wide range of parametric distributions commonly encountered in operations, and bGND leverages gradient boosting with tree learners to flexibly estimate the parameters of the GND as functions of covariates. We establish bGND's statistical consistency, thereby extending this key property to special cases studied in the ML literature that lacked such guarantees. Using data from a large academic emergency department in the United States, we show that the distributional forecasting of patient wait and service times can be meaningfully improved by leveraging findings from the healthcare operations literature. Specifically, bGND performs 6% and 9% better than the distribution-agnostic ML benchmark in forecasting wait and service times, respectively. Further analysis suggests that these improvements translate into a 9% increase in patient satisfaction and a 4% reduction in mortality for myocardial infarction patients. Our work underscores the importance of integrating ML with operations knowledge to enhance distributional forecasts.
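
As an illustration of how one family can cover several familiar distributions, the following is a minimal sketch of a GND log-density. It assumes the common exponential-power parameterization (location `mu`, scale `alpha`, shape `beta`), which may differ from the one used in the paper. The function names are hypothetical and are not taken from the authors' code. In a boosting setting, the negative of this log-density would plausibly serve as the loss, with the parameters modeled as functions of covariates.

```python
import math


def gnd_logpdf(x, mu, alpha, beta):
    """Log-density of a generalized normal distribution.

    Assumed parameterization (not confirmed by the paper):
        f(x) = beta / (2 * alpha * Gamma(1 / beta)) * exp(-(|x - mu| / alpha) ** beta)
    with scale alpha > 0 and shape beta > 0.
    """
    z = abs(x - mu) / alpha
    return (math.log(beta) - math.log(2.0 * alpha)
            - math.lgamma(1.0 / beta) - z ** beta)


def normal_logpdf(x, mu, sigma):
    """Log-density of a normal distribution, used only for comparison."""
    return (-0.5 * math.log(2.0 * math.pi * sigma ** 2)
            - (x - mu) ** 2 / (2.0 * sigma ** 2))


def laplace_logpdf(x, mu, b):
    """Log-density of a Laplace distribution, used only for comparison."""
    return -math.log(2.0 * b) - abs(x - mu) / b


# Shape beta = 2 recovers the normal with sigma = alpha / sqrt(2);
# shape beta = 1 recovers the Laplace with scale b = alpha.
gap_normal = gnd_logpdf(1.3, 0.5, 2.0, 2.0) - normal_logpdf(1.3, 0.5, 2.0 / math.sqrt(2.0))
gap_laplace = gnd_logpdf(1.3, 0.5, 2.0, 1.0) - laplace_logpdf(1.3, 0.5, 2.0)
```

Under this parameterization, both gaps are zero up to floating-point error, so fixing the shape parameter selects a specific member of the family while the remaining parameters stay free.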

Keywords: Distributional Machine Learning, Gradient Boosting, Wait Times, Service Times, Emergency Departments, Healthcare Operations, Boosted Generalized Normal Distributions

Suggested Citation

Gürlek, Ragıp and de Véricourt, Francis and Lee, Donald K.K., Boosted Generalized Normal Distributions: Integrating Machine Learning with Operations Knowledge (July 26, 2024). Available at SSRN: https://ssrn.com/abstract=4906838 or http://dx.doi.org/10.2139/ssrn.4906838

Ragıp Gürlek

Emory University - Goizueta Business School

1300 Clifton Road
Atlanta, GA 30322-2722
United States

Francis de Véricourt

ESMT European School of Management and Technology

Schlossplatz 1
10117 Berlin
Germany

Donald K.K. Lee (Contact Author)

Emory University - Goizueta Business School

1300 Clifton Road
Atlanta, GA 30322-2722
United States

Emory University - Dept of Biostatistics & Bioinformatics

Atlanta, GA 30322
United States
