Ensembling Experiments to Optimize Interventions along Customer Journey: A Reinforcement Learning Approach

Posted: 11 Oct 2021


Yicheng Song

University of Minnesota - Twin Cities - Carlson School of Management

Tianshu Sun

University of Southern California - Marshall School of Business

Date Written: October 8, 2021

Abstract

Randomized experiments (A/B tests) are the gold standard of causal inference and have been widely adopted by firms to evaluate online interventions (website design, creative content, pricing, promotions). Most such experiments are designed to nail down the impact of one specific intervention in the customer journey and obtain a clean causal effect. However, the literature on experimentation and causal inference lacks a holistic approach to optimizing a sequence of interventions along the customer journey. Specifically, locally optimal interventions identified by one-shot experiments may be globally sub-optimal once the interdependence among interventions and the long-term reward along the customer journey are taken into account. Fortunately, the accumulation of a large number of historical experiments creates a trail of exogenous interventions at different stages of customers' path-to-purchase, and with it a new opportunity. In this paper, we integrate historical experiments with a Reinforcement Learning (RL) algorithm to tackle a question that standalone one-shot experiments cannot answer: how can the ensemble of experiments be used to identify the optimal sequence of interventions along customers' path-to-purchase? We propose a Bayesian Deep Recurrent Q Network (BDRQN) model that leverages the exogenous interventions in the historical experiment data to learn the effectiveness of interventions at different stages of the customer journey and to optimize them for long-term reward. The Bayesian approach empowers the model not only to identify the long-term reward of various interventions but also to estimate the distribution of those expected rewards. Thus, beyond optimization within the existing experiments and data, the BDRQN framework and its resulting estimates can also guide the allocation of future experiments along the customer journey toward high-potential but uncertain interventions. In summary, the proposed RL+AB approach creates a two-way complementarity between RL and field experiments and thereby provides a holistic approach to optimizing the customer journey.
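To make the idea concrete, below is a minimal sketch (not the authors' implementation) of a Bayesian recurrent Q-network of the kind the abstract describes. The abstract does not specify the posterior approximation, so this sketch uses MC dropout as a stand-in Bayesian mechanism; all names and dimensions (STATE_DIM, N_INTERVENTIONS, hidden sizes) are illustrative assumptions.

```python
# A hypothetical sketch of a Bayesian Deep Recurrent Q Network (BDRQN).
# Assumptions: a customer journey is a sequence of event vectors; each
# action is one of a small set of interventions; Bayesian uncertainty is
# approximated via MC dropout rather than the paper's (unspecified) scheme.
import torch
import torch.nn as nn

STATE_DIM = 16        # features per journey event (assumed)
N_INTERVENTIONS = 8   # size of the intervention set (assumed)

class BayesianRecurrentQNet(nn.Module):
    def __init__(self, state_dim=STATE_DIM, n_actions=N_INTERVENTIONS,
                 hidden=64, p_drop=0.2):
        super().__init__()
        self.gru = nn.GRU(state_dim, hidden, batch_first=True)
        self.drop = nn.Dropout(p_drop)  # kept stochastic for MC dropout
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, journey):
        # journey: (batch, n_stages, state_dim) sequence of events so far
        out, _ = self.gru(journey)
        h = self.drop(out[:, -1, :])    # summary of the journey to date
        return self.head(h)             # Q(s, a) for every intervention

@torch.no_grad()
def q_posterior(net, journey, n_samples=50):
    """Approximate the posterior over Q-values by sampling dropout masks."""
    net.train()  # keep dropout active so each pass is a posterior draw
    draws = torch.stack([net(journey) for _ in range(n_samples)])
    return draws.mean(0), draws.std(0)  # expected long-term reward, uncertainty

def choose_intervention(net, journey):
    """Thompson-sampling-style choice: one posterior draw, take its argmax."""
    net.train()
    with torch.no_grad():
        sampled_q = net(journey)        # a single stochastic forward pass
    return int(sampled_q.argmax(dim=-1))
```

In this reading, `q_posterior` returns both the expected reward of each intervention and its uncertainty; interventions with a high mean but wide spread are exactly the "high potential but uncertain" candidates the abstract suggests prioritizing in future experiments.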

Keywords: Randomized experiments, Customer Journey, Reinforcement Learning, Optimization, Bayesian Deep Recurrent Q Network Model, Experiment Design

Suggested Citation

Song, Yicheng and Sun, Tianshu, Ensembling Experiments to Optimize Interventions along Customer Journey: A Reinforcement Learning Approach (October 8, 2021). Available at SSRN: https://ssrn.com/abstract=3939073

Yicheng Song (Contact Author)

University of Minnesota - Twin Cities - Carlson School of Management ( email )

19th Avenue South
Minneapolis, MN 55455
United States

HOME PAGE: http://people.bu.edu/ycsong/

Tianshu Sun

University of Southern California - Marshall School of Business ( email )

3670 Trousdale Parkway
Bridge Hall 310B
Los Angeles, CA 90089
United States

