Dynamic Personalization with Multiple Customer Signals: Multi-Response State Representation in Reinforcement Learning

54 Pages. Posted: 24 Feb 2025. Last revised: 6 Feb 2025.


Liangzong Ma

Harvard Business School

Ta-Wei Huang

Harvard Business School

Eva Ascarza

Harvard Business School

Ayelet Israeli

Harvard Business School - Marketing Unit

Date Written: February 05, 2025

Abstract

Reinforcement learning (RL) offers potential for optimizing sequences of customer interactions by modeling the relationships between customer states, company actions, and long-term value. However, its practical implementation often faces significant challenges. First, while companies collect detailed customer characteristics to represent customer states, these data often contain noise or irrelevant information, obscuring the true customer states. Second, existing state construction techniques focus primarily on summarizing characteristics related to short-term values, rather than capturing the broader behaviors that drive long-term customer value. These limitations hinder RL's ability to effectively learn customer dynamics and maximize long-term value. To address these challenges, we introduce a novel Multi-Response State Representation (MRSR) Learning method to enhance existing RL methods. Unlike state construction methods, MRSR utilizes rich customer signals (such as recency, engagement, and spending) to construct low-dimensional state representations that effectively predict behaviors driving long-term customer value. Using data from a free-to-play mobile game with dynamic difficulty adjustments, MRSR demonstrates significant improvements, increasing 30-day in-game currency spending by 37% compared to standard offline RL methods and 24% over advanced state representation techniques. Policy interpretation further highlights MRSR's ability to identify distinct and relevant customer states, enabling precise and targeted interventions to boost long-term engagement and spending.

Keywords: Dynamic Policy, Deep Reinforcement Learning, Customer Relationship Management, Representation Learning, Dynamic Difficulty Adjustment, Latent Variable Model
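
The abstract describes the core idea only at a high level: compress noisy customer characteristics into a low-dimensional state that predicts several behavioral responses at once. The sketch below illustrates that general idea and is not the authors' MRSR method, which the abstract does not specify. It assumes a purely linear encoder, synthetic data, and a joint mean-squared-error loss. The function names (`make_synthetic_customers`, `fit_multi_response_encoder`), the three response columns, and all hyperparameters are hypothetical.

```python
import numpy as np


def make_synthetic_customers(n=500, n_features=20, latent_dim=2, seed=0):
    """Hypothetical data: noisy features and three responses driven by a hidden state."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, latent_dim))                   # hidden customer state
    mix = rng.normal(size=(latent_dim, n_features))
    x = z @ mix + 0.5 * rng.normal(size=(n, n_features))   # noisy observed characteristics
    heads = rng.normal(size=(latent_dim, 3))
    # Assumed response columns: recency, engagement, spending.
    y = z @ heads + 0.1 * rng.normal(size=(n, 3))
    return x, y


def fit_multi_response_encoder(x, y, state_dim=2, lr=0.01, epochs=2000, seed=0):
    """Fit a linear encoder W (features -> state) and heads V (state -> responses).

    Both are trained by plain gradient descent on the mean squared error
    taken jointly over all responses, so the learned state has to be useful
    for every response rather than for a single short-term target.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    k = y.shape[1]
    W = 0.1 * rng.normal(size=(d, state_dim))
    V = 0.1 * rng.normal(size=(state_dim, k))
    losses = []
    for _ in range(epochs):
        s = x @ W                        # low-dimensional state representation
        err = s @ V - y                  # prediction error for every response
        losses.append(float(np.mean(err ** 2)))
        scale = 2.0 / (n * k)            # derivative of the mean over n * k entries
        grad_V = scale * (s.T @ err)
        grad_W = scale * (x.T @ (err @ V.T))
        W -= lr * grad_W
        V -= lr * grad_V
    return W, V, losses


if __name__ == "__main__":
    x, y = make_synthetic_customers()
    W, V, losses = fit_multi_response_encoder(x, y)
    states = x @ W                       # candidate state input for a downstream RL method
    print(states.shape, losses[0], losses[-1])
```

In this toy setup, the matrix `x @ W` plays the role of the learned state that would be passed to an offline RL algorithm in place of the raw characteristics. A practical system would likely use a nonlinear encoder and response-specific losses; those choices are outside what the abstract states.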

Suggested Citation

Ma, Liangzong and Huang, Ta-Wei and Ascarza, Eva and Israeli, Ayelet, Dynamic Personalization with Multiple Customer Signals: Multi-Response State Representation in Reinforcement Learning (February 05, 2025). Available at SSRN: https://ssrn.com/abstract=5126129 or http://dx.doi.org/10.2139/ssrn.5126129

Liangzong Ma (Contact Author)

Harvard Business School

Ta-Wei Huang

Harvard Business School

Boston, MA 02163
United States

Eva Ascarza

Harvard Business School

Soldiers Field
Boston, MA 02163
United States

HOME PAGE: http://evaascarza.com

Ayelet Israeli

Harvard Business School - Marketing Unit

Soldiers Field
Boston, MA 02163
United States

