Learning Customer Preferences from Personalized Assortments
54 Pages Posted: 7 Aug 2018
Date Written: July 17, 2018
A company wishes to identify the most popular version of a product from a menu of alternative options. Unaware of customers' true preferences, the company relies on a feedback system that lets potential buyers indicate which versions they prefer. Under a general ranking-based choice model framework, we study how to dynamically individualize the set of versions shown to each customer for feedback, so that the company can identify the top-ranked version at a fixed probabilistic confidence level using as little feedback as possible. We prove an instance-specific lower bound on the sample complexity and propose a sampling policy, the Myopic Tracking Policy, that is both asymptotically optimal and intuitive to implement. Our methodology draws on previous work in the sequential design of experiments and best arm identification. We illustrate it using a special class of choice models based on Luce's (1959) attraction model, for which we provide a simple closed-form solution that reveals a number of key properties of the Myopic Tracking Policy.
Keywords: sequential learning, maximum selection, best arm identification, personalized assortment planning
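As a rough illustration of the Luce (1959) attraction model named in the abstract, the sketch below computes choice probabilities for a displayed assortment, where each offered version is chosen with probability proportional to its attraction value. It is not the paper's Myopic Tracking Policy: the function names, the version labels, and the attraction values are all hypothetical and chosen only for this example.

```python
import random


def luce_choice_probabilities(attractions, assortment):
    """Return the Luce choice probability of each version in the assortment.

    Under Luce's model, version i is picked from the offered set S with
    probability v_i / sum of v_j over j in S.
    """
    total = sum(attractions[v] for v in assortment)
    return {v: attractions[v] / total for v in assortment}


def simulate_feedback(attractions, assortment, rng):
    """Draw one customer's preferred version from the offered assortment."""
    probs = luce_choice_probabilities(attractions, assortment)
    versions = list(probs)
    weights = [probs[v] for v in versions]
    return rng.choices(versions, weights=weights, k=1)[0]


# Hypothetical attraction values, unknown to the company in practice.
attractions = {"A": 3.0, "B": 2.0, "C": 1.0}

# Show only versions A and C to a customer.
probs = luce_choice_probabilities(attractions, ["A", "C"])

# One simulated piece of feedback from that customer.
rng = random.Random(0)
pick = simulate_feedback(attractions, ["A", "C"], rng)
```

With these assumed values, version A is chosen from the assortment {A, C} with probability 3 / (3 + 1) = 0.75. A learning policy would observe only draws like `pick`, not the attraction values themselves.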