
The Big Data Newsvendor: Practical Insights from Machine Learning

52 Pages Posted: 3 Feb 2015 Last revised: 6 Sep 2016

Gah‐Yi Ban

London Business School

Cynthia Rudin

Duke University - Pratt School of Engineering; Duke University

Date Written: February 6, 2014


We investigate the data-driven newsvendor problem when one has n observations of p features related to the demand as well as historical demand data. We propose two approaches to finding the optimal order quantity in this new setting: Machine Learning (ML), with and without regularization, and Kernel-weights Optimization (KO). We show that the resulting "Big Data" newsvendor problem can be solved as an LP, MIP or QCQP under the ML approach, and by a simple sorting algorithm under the KO approach. We justify the use of feature information by showing that omitting it yields inconsistent decisions, which translates to sub-optimal costs even with an infinite amount of demand data. We then derive finite-sample performance bounds on the out-of-sample costs of the feature-based decisions, which show that (i) the "Big Data" regime, in which over-fitting dominates finite-sample bias, is defined by p > O(n^{-1/(2 + 8/p)}\sqrt{\log{(n)}}), and (ii) both regularized ML and KO are effective methods for handling over-fitting. Finally, we apply the feature-based algorithms to nurse staffing in a hospital emergency room, using a data set from a large UK teaching hospital, and find that (i) the best KO and ML algorithms beat the best-practice benchmark by 23% and 24% respectively in out-of-sample cost, with statistical significance at the 5% level, and (ii) the best KO algorithm is faster than the best ML algorithm by three orders of magnitude and faster than the best-practice benchmark by two orders of magnitude.
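
To make the "simple sorting algorithm" of the KO approach concrete, the following is a minimal sketch, not the authors' implementation. It assumes that KO weights each historical demand by a kernel of the distance between its features and the new feature vector, and then orders the weighted b/(b+h) quantile of demand, where b and h are the per-unit underage and overage costs. The Gaussian kernel, the `bandwidth` parameter, and the function name `ko_order_quantity` are illustrative assumptions that do not appear in the abstract.

```python
import numpy as np


def ko_order_quantity(X, d, x_new, b, h, bandwidth=1.0):
    """Kernel-weighted newsvendor order quantity (illustrative sketch).

    X         : (n, p) array of historical feature vectors
    d         : (n,) array of historical demands
    x_new     : (p,) feature vector for the period being planned
    b, h      : per-unit underage (backorder) and overage (holding) costs
    bandwidth : Gaussian kernel bandwidth (assumed kernel choice)
    """
    X = np.asarray(X, dtype=float)
    d = np.asarray(d, dtype=float)
    x_new = np.asarray(x_new, dtype=float)

    # Squared Euclidean distance from each historical point to x_new.
    sq_dist = np.sum((X - x_new) ** 2, axis=1)

    # Gaussian kernel weights. Subtracting the minimum distance keeps the
    # largest weight at exactly 1, so the normalizer can never be zero.
    weights = np.exp(-(sq_dist - sq_dist.min()) / (2.0 * bandwidth ** 2))
    weights = weights / weights.sum()

    # Sort demands and accumulate their weights in that order.
    order = np.argsort(d)
    cum_weights = np.cumsum(weights[order])

    # Smallest demand whose cumulative weight reaches the critical ratio.
    target = b / (b + h)
    idx = np.searchsorted(cum_weights, target, side="left")
    idx = min(idx, len(d) - 1)  # guard against floating-point round-off
    return d[order][idx]
```

When every historical feature vector equals `x_new`, all weights are equal and the function reduces to the ordinary empirical quantile of demand, which is the feature-free Sample Average Approximation decision. The cost is dominated by the sort, so it scales as O(n log n) in the number of observations.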

Keywords: big data, newsvendor, machine learning, Sample Average Approximation, statistical learning theory, quantile regression

JEL Classification: C44, C61, C80

Suggested Citation

Ban, Gah‐Yi and Rudin, Cynthia, The Big Data Newsvendor: Practical Insights from Machine Learning (February 6, 2014). Available at SSRN.

Gah‐Yi Ban (Contact Author)

London Business School

Sussex Place
Regent's Park
London, London NW1 4SA
United Kingdom

Cynthia Rudin

Duke University - Pratt School of Engineering

Durham, NC 27708
United States

Duke University

Department of Computer Science
LSRC Building
Durham, NC 27708-0204
United States
