Dynamic Personalization with Multiple Customer Signals: Multi-Response State Representation in Reinforcement Learning
54 Pages. Posted: 24 Feb 2025. Last revised: 6 Feb 2025
Date Written: February 05, 2025
Abstract
Reinforcement learning (RL) offers potential for optimizing sequences of customer interactions by modeling the relationships between customer states, company actions, and long-term value. However, its practical implementation often faces significant challenges. First, while companies collect detailed customer characteristics to represent customer states, these data often contain noise or irrelevant information, obscuring the true customer states. Second, existing state construction techniques focus primarily on summarizing characteristics related to short-term values, rather than capturing the broader behaviors that drive long-term customer value. These limitations hinder RL's ability to learn customer dynamics and maximize long-term value. To address these challenges, we introduce a novel Multi-Response State Representation (MRSR) learning method to enhance existing RL methods. Unlike existing state construction methods, MRSR uses rich customer signals (such as recency, engagement, and spending) to construct low-dimensional state representations that predict the behaviors driving long-term customer value. Using data from a free-to-play mobile game with dynamic difficulty adjustments, MRSR demonstrates significant improvements, increasing 30-day in-game currency spending by 37% relative to standard offline RL methods and by 24% relative to advanced state representation techniques. Policy interpretation further highlights MRSR's ability to identify distinct and relevant customer states, enabling precise and targeted interventions to boost long-term engagement and spending.
Keywords: Dynamic Policy, Deep Reinforcement Learning, Customer Relationship Management, Representation Learning, Dynamic Difficulty Adjustment, Latent Variable Model