Dynamic Assortment with Online Learning Under Threshold Multinomial Logit Model
53 Pages. Posted: 30 May 2024. Last revised: 22 Oct 2024
Date Written: May 28, 2024
Abstract
Consumers are often overwhelmed by the extensive assortments that retailers offer and may therefore exhibit bounded rationality in their purchase decisions. However, the existing literature on dynamic assortment optimization rarely accounts for such boundedly rational behavior. This motivates us to employ a simple but effective two-stage consider-then-choose model, the Threshold Multinomial Logit (TMNL) model, to study the assortment optimization problem. The TMNL model characterizes the endogenous formation of consumers' consideration sets through a threshold effect. This endogenous dependency captures more flexible substitution patterns than the classical multinomial logit (MNL) choice model, but it also makes online learning considerably harder. In the offline setting, we analyze the properties of the optimal assortment and propose an efficient assortment optimization algorithm that outperforms the benchmark. In the online setting, where customer preferences and consideration set formation are unknown, we propose online learning algorithms that achieve nearly optimal regret bounds in both the instance-independent and instance-dependent cases. To the best of our knowledge, this is the first work to consider online assortment problems with consumers' endogenous consider-then-choose behavior. Moreover, we extend our algorithm to the contextual learning setting, which effectively mitigates the impact of the number of products on performance. Extensive numerical experiments further validate the efficacy of the proposed algorithms.
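To make the two-stage consider-then-choose idea concrete, the following is a minimal sketch and not the paper's formulation. The abstract does not state the threshold rule, so the code assumes a common threshold form: a product is considered only if its attraction, scaled by (1 + alpha), is at least the largest attraction in the offered assortment, and the choice among considered products then follows a standard MNL with a no-purchase option of attraction 1. The names `consideration_set`, `tmnl_choice_probs`, `expected_revenue`, `attraction`, and `alpha` are hypothetical and chosen for illustration.

```python
from typing import Dict, Iterable, Set

NO_PURCHASE = 0  # reserved key for the outside (no-purchase) option


def consideration_set(assortment: Iterable[int],
                      attraction: Dict[int, float],
                      alpha: float) -> Set[int]:
    """Stage 1 (assumed rule): keep product i only if
    (1 + alpha) * attraction[i] >= max attraction in the assortment."""
    items = list(assortment)
    if not items:
        return set()
    best = max(attraction[i] for i in items)
    return {i for i in items if (1.0 + alpha) * attraction[i] >= best}


def tmnl_choice_probs(assortment: Iterable[int],
                      attraction: Dict[int, float],
                      alpha: float) -> Dict[int, float]:
    """Stage 2: MNL choice restricted to the consideration set.
    The no-purchase option has attraction 1 (an assumption)."""
    items = list(assortment)
    considered = consideration_set(items, attraction, alpha)
    denom = 1.0 + sum(attraction[i] for i in considered)
    probs = {i: (attraction[i] / denom if i in considered else 0.0)
             for i in items}
    probs[NO_PURCHASE] = 1.0 / denom
    return probs


def expected_revenue(assortment: Iterable[int],
                     attraction: Dict[int, float],
                     revenue: Dict[int, float],
                     alpha: float) -> float:
    """Expected revenue of an assortment under the assumed model."""
    probs = tmnl_choice_probs(assortment, attraction, alpha)
    return sum(revenue[i] * p for i, p in probs.items() if i != NO_PURCHASE)
```

Under this assumed rule, adding a very attractive product can push weaker products out of the consideration set, which is one way the consideration stage can produce substitution patterns that a plain MNL cannot. A very large `alpha` keeps every offered product in the consideration set, so the sketch then reduces to the classical MNL.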
Keywords: Online Learning, Threshold Effect, Consideration Set, Assortment Optimization, Bandit Algorithms