When to Target Customers? Retention Management using Dynamic Off-Policy Policy Learning
50 Pages. Posted: 11 Dec 2022. Last revised: 2 May 2024
Date Written: May 2, 2024
Abstract
We propose a method to learn personalized customer retention management strategies when customers’ intentions to purchase evolve over time. Working with a Japanese online platform, we first implement a large-scale randomized experiment, in which coupons are randomly sent to first-time buyers at different times. The experimental data allow us to estimate personalized dynamic retention policies using off-policy policy learning methods. We extend the existing methods by allowing inter-temporal budget constraints and feasibility constraints. Our offline evaluation results show that the optimal dynamic policy is more cost-effective than baseline policies. Finally, we test the optimal policy online to confirm its performance.
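As a rough illustration of how randomized experimental data can support offline evaluation of a candidate policy, the sketch below uses a generic inverse-propensity-weighted (IPW) estimator. This is a standard textbook technique and is not claimed to be the estimator used in the paper; the function name `ipw_policy_value`, the two-timing action coding, and the toy data are all assumptions made for illustration, and the budget and feasibility constraints mentioned above are not modeled.

```python
def ipw_policy_value(logged_actions, rewards, propensities, policy_actions):
    """Generic IPW estimate of a target policy's mean reward (illustrative only).

    logged_actions: action actually randomized to each customer
                    (hypothetical coding: 0 = coupon in week 1, 1 = coupon in week 2)
    rewards:        observed outcome per customer (e.g. 1.0 if a repeat purchase occurred)
    propensities:   probability with which the experiment assigned the logged action
    policy_actions: action the candidate policy would have chosen for each customer
    """
    n = len(rewards)
    if n == 0:
        raise ValueError("need at least one logged observation")
    total = 0.0
    for a, r, p, pi in zip(logged_actions, rewards, propensities, policy_actions):
        # Only customers whose randomized action matches the policy's choice
        # contribute, reweighted by the inverse assignment probability.
        if a == pi:
            total += r / p
    return total / n


# Toy data (fabricated for illustration, not from the paper):
# four customers, each timing assigned with probability 0.5.
logged_actions = [0, 1, 1, 0]
rewards = [0.0, 1.0, 1.0, 0.0]
propensities = [0.5, 0.5, 0.5, 0.5]
always_week_2 = [1, 1, 1, 1]  # candidate policy: always send in week 2

value = ipw_policy_value(logged_actions, rewards, propensities, always_week_2)
```

Because the assignment probabilities are known by design in a randomized experiment, an estimator of this kind needs no model of how coupons were targeted, which is one reason experimental data are convenient for off-policy evaluation.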
Keywords: Retention management, Off-policy learning