Data Aggregation and Demand Prediction

34 Pages. Posted: 30 Jun 2019. Last revised: 31 Mar 2020

Maxime Cohen

Desautels Faculty of Management, McGill University

Kevin Jiao

New York University (NYU) - Leonard N. Stern School of Business

Renyu (Philip) Zhang

New York University Shanghai

Date Written: June 28, 2019


Retailers collect large volumes of transaction data with the goal of predicting future demand. We study how retailers could use clustering techniques to improve demand prediction accuracy. High accuracy in demand prediction allows retailers to better manage their inventory and ultimately mitigate stock-outs and excess supply. It is thus important for retailers to leverage their data for demand prediction. A typical retail setting involves predicting demand for hundreds of products simultaneously. While some products have a large amount of historical data, others were recently introduced and their transaction data can be scarce. A common approach is to cluster several products together and estimate a joint model at the cluster level. In this vein, one can estimate some model parameters by aggregating the data from several items, and other parameters at the item level. In this paper, we propose a practical method, referred to as the Data Aggregation with Clustering (DAC) algorithm, that balances the tradeoff between data aggregation and model flexibility. The DAC allows us to predict demand while optimally identifying the features that should be estimated at the (i) item, (ii) cluster, and (iii) aggregated levels. We analytically show that the DAC yields a consistent estimate along with improved asymptotic properties relative to the traditional ordinary least squares method that treats different items in a decentralized fashion. Using both simulated and real data, we illustrate the improvement in prediction accuracy obtained by the DAC relative to several common benchmarks. Interestingly, the DAC not only has theoretical and practical advantages but also helps retailers discover useful managerial insights.

Keywords: Retail analytics, demand prediction, data aggregation, clustering
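
The idea of estimating some parameters from aggregated data and others at the item level can be illustrated with a small sketch. The code below is not the DAC algorithm from the paper; it is a minimal illustration under assumed conditions: a linear demand model with two hypothetical features, one whose coefficient is shared by all items and one whose coefficient is item-specific, and an assumption that we already know which feature is which. The function names (`simulate`, `fit_decentralized`, `fit_partially_pooled`) and all parameter values are made up for this example.

```python
import numpy as np


def simulate(n_items=5, n_obs=8, shared_beta=2.0, noise_sd=0.5, seed=0):
    """Simulate demand for several items (hypothetical setup, not from the paper).

    Each item's demand depends on one feature with a coefficient common to
    all items and one feature with an item-specific coefficient.
    """
    rng = np.random.default_rng(seed)
    item_betas = rng.normal(0.0, 1.0, size=n_items)
    x_shared = rng.normal(size=(n_items, n_obs))
    x_item = rng.normal(size=(n_items, n_obs))
    noise = rng.normal(scale=noise_sd, size=(n_items, n_obs))
    y = shared_beta * x_shared + item_betas[:, None] * x_item + noise
    return x_shared, x_item, y, item_betas


def fit_decentralized(x_shared, x_item, y):
    """Fit a separate OLS model per item.

    Returns an array of shape (n_items, 2) whose columns hold the estimated
    coefficients on the shared and item-specific features.
    """
    coefs = []
    for xs, xi, yi in zip(x_shared, x_item, y):
        X = np.column_stack([xs, xi])
        beta, *_ = np.linalg.lstsq(X, yi, rcond=None)
        coefs.append(beta)
    return np.array(coefs)


def fit_partially_pooled(x_shared, x_item, y):
    """Fit one joint OLS model over all items.

    The first design-matrix column pools the shared feature across items;
    the remaining columns give each item its own coefficient.
    """
    n_items, n_obs = y.shape
    X = np.zeros((n_items * n_obs, 1 + n_items))
    X[:, 0] = x_shared.ravel()
    for i in range(n_items):
        rows = slice(i * n_obs, (i + 1) * n_obs)
        X[rows, 1 + i] = x_item[i]
    beta, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
    return beta[0], beta[1:]


if __name__ == "__main__":
    xs, xi, y, true_item = simulate()
    shared_hat, item_hat = fit_partially_pooled(xs, xi, y)
    per_item = fit_decentralized(xs, xi, y)
    print("pooled estimate of shared coefficient:", shared_hat)
    print("per-item estimates of shared coefficient:", per_item[:, 0])
```

In this sketch the pooled fit uses every item's observations to estimate the shared coefficient, while the decentralized fit re-estimates it from each item's few observations. Deciding which features belong at the item, cluster, or aggregated level is the part the abstract attributes to the DAC, and it is not attempted here.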

Suggested Citation

Cohen, Maxime and Jiao, Kevin and Zhang, Renyu, Data Aggregation and Demand Prediction (June 28, 2019). NYU Stern School of Business, Available at SSRN.

Maxime Cohen (Contact Author)

Desautels Faculty of Management, McGill University

1001 Sherbrooke St. W
Montreal, Quebec H3A 1G5

Kevin Jiao

New York University (NYU) - Leonard N. Stern School of Business

44 W. 4th St
Suite 9-160
New York, NY 10012
United States

Renyu Zhang

New York University Shanghai

1555 Century Avenue
Shanghai, 200122
86-21-20595135 (Phone)

