Customer-Base Analysis on a 'Data Diet': Model Inference Using Repeated Cross-Sectional Summary (RCSS) Data
38 Pages. Posted: 14 Nov 2010. Last revised: 22 Dec 2013
Date Written: December 21, 2013
Abstract
We address a critical question that many firms are facing in this era of "big data": Can customer data be stored and analyzed in an easy-to-manage and scalable manner without significantly compromising the inferences that can be made about the customers' transaction activity? We address this question in the context of customer-base analysis. A number of researchers have developed customer-base analysis models that perform very well given detailed individual-level data. We explore the possibility of estimating these models using aggregated data summaries alone, namely repeated cross-sectional summaries (RCSS) of the transaction data (e.g., four quarterly histograms). Such summaries are easy to create, visualize, and distribute, irrespective of the size of the customer base. An added advantage of RCSS data is that individual customers cannot be identified, which makes it desirable from a privacy viewpoint as well. We focus on the widely used Pareto/NBD model and carry out a comprehensive simulation study covering a vast spectrum of market scenarios. Our results consistently and convincingly establish that model performance associated with the use of three or four cross-sections of RCSS data (as judged by model fit, parameter recovery, and forward-looking metrics of customer value) can closely match the model performance associated with the use of individual-level data. We confirm the results of the simulations on a real dataset of purchases from an online fashion retailer. The thesis of our approach is that existing statistical models continue to have value in a "big data" world, but to harness this value one may want to approach estimation of these models in a different manner.
Keywords: Customer-base analysis, probability models, Pareto/NBD, scalability, data aggregation, information loss
JEL Classification: C15, C23, C24, C51, C53, C81, M31
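To make the idea of "four quarterly histograms" concrete, the following is a minimal sketch of how an individual-level transaction log might be reduced to RCSS data. It is an illustration only, not the authors' procedure: the function names (`quarter_index`, `rcss_histograms`), the fixed customer cohort, the calendar-quarter periods, and the right-censored top bin (`max_bin`) are all assumptions made here for the example.

```python
from collections import Counter
from datetime import date


def quarter_index(d, start):
    """Zero-based calendar-quarter offset of date d from the quarter containing start."""
    return (d.year - start.year) * 4 + (d.month - 1) // 3 - (start.month - 1) // 3


def rcss_histograms(transactions, customers, start, n_periods, max_bin=5):
    """Build one histogram of per-customer transaction counts for each period.

    transactions: iterable of (customer_id, date) pairs (hypothetical layout).
    customers:    the fixed cohort, so customers with zero purchases are counted too.
    max_bin:      assumed right-censoring point; counts >= max_bin share the last bin.
    Returns a list of n_periods lists, where entry k is the number of customers
    who made k transactions in that period.
    """
    # Count transactions per (period, customer) pair.
    counts = Counter()
    for cust, d in transactions:
        q = quarter_index(d, start)
        if 0 <= q < n_periods:
            counts[(q, cust)] += 1

    # Collapse to histograms; customer identities are discarded at this step.
    histograms = []
    for q in range(n_periods):
        hist = [0] * (max_bin + 1)
        for cust in customers:
            k = min(counts[(q, cust)], max_bin)
            hist[k] += 1
        histograms.append(hist)
    return histograms


# Toy data: three customers observed over two quarters of 2013.
customers = ["a", "b", "c"]
transactions = [
    ("a", date(2013, 1, 5)),
    ("a", date(2013, 2, 10)),
    ("c", date(2013, 1, 20)),
    ("b", date(2013, 4, 1)),
    ("a", date(2013, 5, 3)),
]
hists = rcss_histograms(transactions, customers, date(2013, 1, 1), n_periods=2, max_bin=2)
print(hists)  # [[1, 1, 1], [1, 2, 0]]
```

Each histogram has a fixed length regardless of how many customers are in the cohort, and no row can be traced back to an individual customer, which is the scalability and privacy property the abstract describes. Estimating a model such as the Pareto/NBD from these summaries is a separate step that this sketch does not attempt.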