Data Aggregation and Demand Prediction
34 Pages. Posted: 30 Jun 2019. Last revised: 31 Mar 2020.
Date Written: June 28, 2019
Retailers collect large volumes of transaction data with the goal of predicting future demand. We study how retailers can use clustering techniques to improve demand prediction accuracy. High accuracy in demand prediction allows retailers to better manage their inventory and ultimately mitigate stock-outs and excess supply. It is thus important for retailers to leverage their data for demand prediction. A typical retail setting involves predicting demand for hundreds of products simultaneously. While some products have a large amount of historical data, others were recently introduced, and their transaction data can be scarce. A common approach is to cluster several products together and estimate a joint model at the cluster level. In this vein, one can estimate some model parameters by aggregating the data from several items, and other parameters at the item level. In this paper, we propose a practical method, referred to as the Data Aggregation with Clustering (DAC) algorithm, that balances the tradeoff between data aggregation and model flexibility. The DAC allows us to predict demand while optimally identifying the features that should be estimated at the (i) item, (ii) cluster, and (iii) aggregated levels. We analytically show that the DAC yields a consistent estimate along with improved asymptotic properties relative to the traditional ordinary least squares method that treats different items in a decentralized fashion. Using both simulated and real data, we illustrate the improvement in prediction accuracy obtained by the DAC relative to several common benchmarks. Interestingly, the DAC not only has theoretical and practical advantages but also helps retailers discover useful managerial insights.
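The idea of estimating some parameters on aggregated data and others at the item level can be illustrated with a small sketch. This is not the DAC algorithm itself, which also selects the level for each feature and forms clusters. It only shows, on simulated data, a linear demand model in which a price coefficient is pooled across items while intercepts stay item-specific, next to the decentralized benchmark of one ordinary least squares fit per item. The data-generating values, the variable names, and the choice of price as the shared feature are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 3 items, 200 observations each.
n_items, n_obs = 3, 200
shared_beta = -2.0                               # assumed price effect common to all items
item_intercepts = np.array([10.0, 14.0, 18.0])   # assumed item-specific baseline demand

item = np.repeat(np.arange(n_items), n_obs)
price = rng.uniform(1.0, 5.0, size=n_items * n_obs)
noise = rng.normal(0.0, 0.5, size=price.size)
demand = item_intercepts[item] + shared_beta * price + noise

# Partially aggregated model: one dummy column per item (item-level
# intercepts) plus a single pooled price column (aggregated-level slope).
dummies = np.eye(n_items)[item]
X = np.column_stack([dummies, price])
coef, *_ = np.linalg.lstsq(X, demand, rcond=None)
est_intercepts, est_shared_beta = coef[:n_items], coef[n_items]

# Decentralized benchmark: a separate OLS fit for each item.
decentralized_betas = []
for i in range(n_items):
    mask = item == i
    Xi = np.column_stack([np.ones(mask.sum()), price[mask]])
    ci, *_ = np.linalg.lstsq(Xi, demand[mask], rcond=None)
    decentralized_betas.append(ci[1])
decentralized_betas = np.array(decentralized_betas)

print("pooled price coefficient:", est_shared_beta)
print("per-item price coefficients:", decentralized_betas)
```

When the price effect really is common across items, the pooled slope is estimated from all 600 observations rather than 200, which is the kind of efficiency gain that data aggregation targets. If the effect differs across items, pooling would bias the estimate, which is the flexibility side of the tradeoff.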
Keywords: Retail analytics, demand prediction, data aggregation, clustering