Dynamic Assortment Planning Without Utility Parameter Estimation
40 pages. Posted: 8 Mar 2018
Date Written: March 2, 2018
We study a family of stylized dynamic assortment planning problems in which, for each arriving customer, the seller offers an assortment of substitutable products and the customer makes a purchase among the offered products according to a discrete choice model. This paper considers two popular choice models: the multinomial logit (MNL) model and the nested logit model. Since all the utility parameters of customers are unknown, the seller must simultaneously learn customers' choice behavior and make dynamic assortment decisions based on current knowledge. The goal of the seller is to maximize the expected revenue, or equivalently, to minimize the worst-case expected regret. Although the dynamic assortment planning problem has received increasing attention in revenue management, most existing policies require estimating the mean utility of each product, and the final regret usually involves the number of products N. However, when N is large compared to the horizon length T, accurate estimation of the mean utilities is extremely difficult. To deal with the large-N case, which is natural in many online applications, we propose new policies that completely avoid estimating the utility parameter of each product; as a result, our regret is independent of N. In particular, for the MNL model, we develop a dynamic trisection search algorithm that achieves the optimal regret (up to a log factor). For the nested logit model, we propose a lower and upper confidence bound algorithm with an aggregated estimation. The proposed policies have two major advantages. First, the regret of all our policies has no dependence on N. Second, our policies are almost assumption-free: there is no assumption on the mean utilities, nor any "separability" condition on the expected revenues of different assortments. We also provide numerical results to demonstrate the empirical performance of the proposed methods.
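For readers unfamiliar with the MNL choice model mentioned above, the following is a minimal sketch of how the expected revenue of an offered assortment is computed under the standard MNL parametrization. It is an illustration of the choice model only, not the paper's trisection policy, which is designed precisely to avoid estimating the per-product weights. The names `mnl_choice_probs`, `expected_revenue`, `v`, and `r`, and all numeric values, are assumptions made for this example; the no-purchase weight is normalized to 1, as is conventional.

```python
from typing import Dict, Hashable, Iterable, Optional


def mnl_choice_probs(
    assortment: Iterable[Hashable], v: Dict[Hashable, float]
) -> Dict[Optional[Hashable], float]:
    """Standard MNL purchase probabilities for an offered assortment.

    v[i] is the (hypothetical) preference weight of product i, i.e. the
    exponential of its mean utility. The no-purchase option has weight 1
    and is stored under the key None.
    """
    items = list(assortment)
    denom = 1.0 + sum(v[i] for i in items)
    probs: Dict[Optional[Hashable], float] = {i: v[i] / denom for i in items}
    probs[None] = 1.0 / denom  # probability that the customer buys nothing
    return probs


def expected_revenue(
    assortment: Iterable[Hashable],
    v: Dict[Hashable, float],
    r: Dict[Hashable, float],
) -> float:
    """Expected revenue from one customer: sum of r[i] * P(i | assortment)."""
    items = list(assortment)
    probs = mnl_choice_probs(items, v)
    return sum(r[i] * probs[i] for i in items)


# Illustrative (made-up) weights and per-unit revenues for three products.
v = {1: 1.0, 2: 1.0, 3: 2.0}
r = {1: 4.0, 2: 2.0, 3: 1.0}

for s in ([1], [1, 2], [1, 2, 3]):
    print(s, expected_revenue(s, v, r))
```

In this toy instance, adding the low-revenue product 3 lowers the expected revenue because it draws demand away from the higher-revenue products, which is why the choice of assortment matters. In the setting of the abstract, the weights `v` are unknown to the seller, and the regret of a policy is the gap between the revenue of the best assortment and the revenue the policy actually collects over the horizon T.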
Keywords: dynamic assortment optimization, regret analysis, lower and upper confidence bounds, nested logit models