Bayesian Consumer Profiling: How to Estimate Consumer Characteristics from Aggregate Data
65 Pages. Posted: 2 Mar 2016. Last revised: 2 Sep 2021.
Date Written: August 31, 2021
Abstract
Firms use aggregate data from data brokers (e.g., Acxiom, Experian) and external data sources (e.g., Census) to infer the likely characteristics of consumers in a target list and thus better predict consumers’ profiles and needs unobtrusively. We demonstrate that the simple count method most commonly used in this effort relies implicitly on an assumption of conditional independence that fails to hold in many settings of managerial interest. We develop a Bayesian profiling method that introduces different conditional independence assumptions. We also show how to introduce additional observed covariates into this model. We use simulations to show that, in managerially relevant settings, the Bayesian method outperforms the simple count method, often by an order of magnitude. We then compare different conditional independence assumptions in two case studies. The first example estimates customers’ age on the basis of their first names; prediction errors decrease substantially. In the second example, we infer the income, occupation, and education of online visitors to a marketing analytics software company based exclusively on their IP addresses. The face validity of the predictions improves dramatically and reveals a more interesting (and more complex) endogenous list-selection mechanism than the one suggested by the simple count method.
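The contrast between the two approaches can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the simple count method averages the population-level P(age | name) over the names on a list, and that one Bayesian alternative keeps P(name | age) fixed at its population value while estimating the list's own age distribution by expectation-maximization. The function names `simple_count` and `bayesian_profile`, the two age groups, the two names, and all numbers are hypothetical and synthetic.

```python
import numpy as np

def simple_count(p_age_given_name, name_counts):
    """Average the population P(age | name) over the names on the list.

    Implicitly assumes P(age | name) is the same on the list as in the
    population (an assumption made for this sketch).
    """
    weights = name_counts / name_counts.sum()
    return weights @ p_age_given_name              # shape: (ages,)

def bayesian_profile(p_name_given_age, name_counts, n_iter=500):
    """EM estimate of the list's age distribution.

    Assumes instead that P(name | age) is the same on the list as in the
    population, and treats the list's age distribution as unknown.
    """
    n_ages = p_name_given_age.shape[0]
    pi = np.full(n_ages, 1.0 / n_ages)             # flat starting guess
    weights = name_counts / name_counts.sum()
    for _ in range(n_iter):
        joint = pi[:, None] * p_name_given_age     # (ages, names)
        post = joint / joint.sum(axis=0, keepdims=True)  # P(age | name, list)
        pi = post @ weights                        # updated age shares
    return pi

# Synthetic population: two age groups (young, old), two first names.
pop_age = np.array([0.5, 0.5])
p_name_given_age = np.array([[0.8, 0.2],           # young
                             [0.3, 0.7]])          # old
joint = pop_age[:, None] * p_name_given_age
p_age_given_name = (joint / joint.sum(axis=0, keepdims=True)).T  # (names, ages)

# A list that is 90% young yields names in proportion 0.75 / 0.25.
name_counts = np.array([750, 250])

count_est = simple_count(p_age_given_name, name_counts)
bayes_est = bayesian_profile(p_name_given_age, name_counts)
print(count_est)  # approximately [0.60, 0.40]
print(bayes_est)  # approximately [0.90, 0.10]
```

In this synthetic setting the list is 90% young by construction. The count-based estimate is pulled toward the population's 50/50 split, while the EM estimate recovers the list's own composition because its conditional independence assumption matches how the data were generated.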
Keywords: Consumer profiling; Data augmentation; Data brokerage; Bayesian profiling; Sociodemographic profiling; First name; Age; Political partisanship; Geolocation.
JEL Classification: M3, M31, C11