39 Pages. Posted: 29 Feb 2016. Last revised: 10 Dec 2016
Date Written: February 23, 2016
We consider the problem faced by a firm that receives highly differentiated products in an online fashion and must price them in order to sell them to its customer base. Each product is described by a vector of features, and its market value is linear in the values of those features. The firm does not initially know the feature values, but it can learn them from whether past products sold at their posted prices. This model is motivated by a question in online advertising, where impressions arrive over time and can be described by vectors of features. We first consider a multi-dimensional version of binary search over polyhedral sets and show that it has exponential worst-case regret. We then propose a modification of this algorithm in which the uncertainty sets are replaced by their Löwner-John ellipsoids. We show that the modified algorithm has a worst-case regret that is quadratic in the dimensionality of the feature space and logarithmic in the time horizon. We also show how to adapt our algorithm to the case where valuations are noisy by using a technique called shallow cuts. Finally, we present computational experiments to illustrate the performance of our algorithm.
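The ellipsoid-based idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes noiseless valuations, a sale whenever the posted price is at most the market value, central cuts through the ellipsoid's center, and the standard ellipsoid-method update for the Löwner-John ellipsoid of a half-ellipsoid. The names `EllipsoidPricer` and `simulate`, the threshold `eps`, and the prior `radius` are hypothetical choices made for illustration.

```python
import math
import random


class EllipsoidPricer:
    """Hypothetical helper. Uncertainty set for the unknown feature values:
    {theta : (theta - c)^T A^{-1} (theta - c) <= 1}."""

    def __init__(self, dim, radius, eps):
        if dim < 2:
            raise ValueError("the central-cut update used here needs dim >= 2")
        self.d = dim
        self.eps = eps  # assumed threshold: explore only if the value range is wider
        self.c = [0.0] * dim  # ellipsoid center
        self.A = [[radius ** 2 if i == j else 0.0 for j in range(dim)]
                  for i in range(dim)]  # starts as a ball of the given radius

    def _A_times(self, x):
        return [sum(a * xj for a, xj in zip(row, x)) for row in self.A]

    def bounds(self, x):
        """Smallest and largest market value theta.x consistent with the ellipsoid."""
        mid = sum(ci * xi for ci, xi in zip(self.c, x))
        quad = sum(xi * v for xi, v in zip(x, self._A_times(x)))
        half = math.sqrt(max(quad, 0.0))
        return mid - half, mid + half

    def price(self, x):
        """Return (price, exploring)."""
        lo, hi = self.bounds(x)
        if hi - lo <= self.eps:
            return lo, False  # exploit: the lowest consistent value always sells
        return (lo + hi) / 2.0, True  # explore: cut through the center

    def update(self, x, sold):
        """Replace the consistent half-ellipsoid by its Lowner-John ellipsoid."""
        d = self.d
        Ax = self._A_times(x)
        norm = math.sqrt(sum(xi * v for xi, v in zip(x, Ax)))
        sign = 1.0 if sold else -1.0  # a sale means theta.x >= price
        b = [sign * v / norm for v in Ax]
        self.c = [ci + bi / (d + 1) for ci, bi in zip(self.c, b)]
        scale = d * d / (d * d - 1.0)
        self.A = [[scale * (self.A[i][j] - 2.0 / (d + 1) * b[i] * b[j])
                   for j in range(d)] for i in range(d)]


def simulate(theta, rounds, radius, eps, seed=0):
    """Price randomly drawn unit-norm feature vectors against a hidden theta."""
    rng = random.Random(seed)
    d = len(theta)
    pricer = EllipsoidPricer(d, radius, eps)
    stats = {"explore": 0, "exploit": 0, "exploit_sold": 0, "regret": 0.0}
    for _ in range(rounds):
        raw = [rng.random() + 1e-3 for _ in range(d)]
        length = math.sqrt(sum(r * r for r in raw))
        x = [r / length for r in raw]
        value = sum(t * xi for t, xi in zip(theta, x))
        p, exploring = pricer.price(x)
        sold = p <= value
        stats["regret"] += value - p if sold else value
        if exploring:
            stats["explore"] += 1
            pricer.update(x, sold)
        else:
            stats["exploit"] += 1
            stats["exploit_sold"] += int(sold)
    return stats


if __name__ == "__main__":
    print(simulate([0.5, 0.3, 0.8], rounds=3000, radius=2.0, eps=0.01))
```

In this sketch the ellipsoid is updated only in exploration rounds, so the true feature values stay inside the uncertainty set and every exploitation price results in a sale. The regret of an exploitation round is then at most `eps`.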
Keywords: Multi-armed bandits, contextual bandits, ellipsoid method, online advertising
JEL Classification: C61, D42, D81, D83
Suggested Citation:
Cohen, Maxime C. and Lobel, Ilan and Paes Leme, Renato, Feature-Based Dynamic Pricing (February 23, 2016). Available at SSRN: https://ssrn.com/abstract=2737045