Overcoming the Cold Start Problem of CRM using a Probabilistic Machine Learning Approach

99 Pages. Posted: 17 Mar 2017. Last revised: 2 Mar 2021

Nicolas Padilla

London Business School - Department of Marketing

Eva Ascarza

Harvard Business School

Date Written: February 10, 2021


The success of Customer Relationship Management (CRM) programs ultimately depends on the firm’s ability to understand consumers’ preferences and to capture precisely how these preferences differ across customers. Only by understanding customer heterogeneity can firms tailor their activities to the right customers, thereby increasing customer value while maximizing the return on marketing efforts. However, identifying differences across customers is very difficult when firms attempt to manage new customers, for whom only the first purchase has been observed. For those customers, the lack of repeated observations poses a structural challenge for inferring unobserved differences among them. This is what we call the "cold start" problem of CRM: companies have difficulty leveraging existing data when they attempt to make inferences about customers at the beginning of their relationship.

In this research, we propose a solution to the cold start problem by developing a modeling framework that leverages the information collected at the moment of acquisition. The key feature of the model is that it flexibly captures latent dimensions that govern both the behaviors observed at acquisition and future propensities to buy and to respond to marketing actions. Using probabilistic machine learning, we combine deep exponential families with the demand model, relating behaviors observed in the first purchase to subsequent customer behavior. The model can be integrated with a variety of demand specifications and is flexible enough to capture a wide range of heterogeneity structures (both linear and non-linear), making it applicable to a variety of behaviors and contexts. We validate our approach in a retail context and illustrate how the focal firm can overcome the cold start problem by augmenting the (thin) historical data for new customers using the firm's transactional database and applying the proposed modeling framework to those data. We empirically demonstrate the model's ability to identify high-value customers, as well as those most sensitive to marketing actions, right after their first purchase. Leveraging the model predictions, the firm can also identify the most relevant variables (transaction characteristics or products purchased at the moment of acquisition) that are predictive of behaviors of interest (e.g., sensitivity to email communications).
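
The central idea, that a shared latent trait drives both what is observed at acquisition and later demand, can be illustrated with a toy simulation. The sketch below is not the authors' model: it replaces the deep exponential family with a single Gaussian latent variable and linear links, and every name and number in it (`simulate_customers`, `infer_latent`, the loadings, the coefficients) is a hypothetical choice made for illustration only.

```python
import math
import random


def simulate_customers(n, seed=0):
    """Simulate customers whose latent trait z drives acquisition data and demand.

    Assumption (illustrative only): one Gaussian latent trait with linear links,
    not the deep exponential family described in the abstract.
    """
    rng = random.Random(seed)
    loadings = [1.0, 0.8, -0.6]  # hypothetical link from z to acquisition features
    customers = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)  # unobserved customer trait
        # Features observed at the first purchase: noisy functions of z.
        x = [w * z + rng.gauss(0.0, 0.5) for w in loadings]
        customers.append({
            "z": z,
            "x": x,
            "intercept": -1.0 + 0.7 * z,   # baseline propensity to buy
            "email_sens": 0.3 + 0.4 * z,   # sensitivity to an email action
        })
    return loadings, customers


def purchase_prob(customer, email):
    """Logistic demand model: probability of purchase given email in {0, 1}."""
    u = customer["intercept"] + customer["email_sens"] * email
    return 1.0 / (1.0 + math.exp(-u))


def infer_latent(x, loadings, noise_sd=0.5):
    """Posterior mean of z given acquisition features x.

    Uses the conjugate Gaussian result for a N(0, 1) prior on z and
    independent Gaussian noise with standard deviation noise_sd.
    """
    precision = 1.0 + sum(w * w for w in loadings) / noise_sd ** 2
    weighted = sum(w * xi for w, xi in zip(loadings, x)) / noise_sd ** 2
    return weighted / precision


def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)


if __name__ == "__main__":
    loadings, customers = simulate_customers(2000, seed=1)
    z_hat = [infer_latent(c["x"], loadings) for c in customers]
    sens = [c["email_sens"] for c in customers]
    # How well does a first-purchase-only estimate track true email sensitivity?
    print("corr(z_hat, email sensitivity):", round(correlation(z_hat, sens), 3))
```

In this toy setting, a trait estimate built only from first-purchase features tracks each customer's simulated email sensitivity, which is the mechanism the abstract relies on. The framework described above uses richer, possibly non-linear latent structures and learns the links from data rather than assuming them.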

Keywords: Customer Relationship Management (CRM), Deep Exponential Families, Probabilistic Machine Learning, Cold Start Problem.

Suggested Citation

Padilla, Nicolas and Ascarza, Eva, Overcoming the Cold Start Problem of CRM using a Probabilistic Machine Learning Approach (February 10, 2021). Columbia Business School Research Paper No. 17-37, Available at SSRN: https://ssrn.com/abstract=2933291 or http://dx.doi.org/10.2139/ssrn.2933291

Nicolas Padilla (Contact Author)

London Business School - Department of Marketing

Sussex Place
Regent's Park
London, NW1 4SA
United Kingdom

HOME PAGE: http://www.nicolaspadilla.com

Eva Ascarza

Harvard Business School

Soldiers Field
Boston, MA 02163
United States

HOME PAGE: http://evaascarza.com
