43 Pages
Posted: 16 Jan 2016
Date Written: January 14, 2016
We consider a seller who repeatedly sells a nondurable product to a single customer whose valuations of the product are drawn from a certain distribution. The seller, who initially does not know the valuation distribution, may use the customer's purchase history to learn, and wishes to choose a pricing policy that maximizes her long-run revenue. Such a problem is at the core of personalized revenue management where the seller can access each customer's individual purchase history and offer personalized prices.
In this paper, we study this learning problem when the customer is aware of the seller's policy and may therefore behave strategically when making a purchase decision. Using a Bayesian setting with a binary prior, we first show that a naive myopic Bayesian policy (MBP) by the seller may lead to incomplete learning -- the seller may never be able to ascertain the true type of the customer, and the regret may grow linearly in time. The failure of the MBP is due to the strategic actions taken by the customer. To resolve this issue, we propose a randomized Bayesian policy (RBP), under which the seller updates her posterior belief about the customer's type in each period only with a certain probability. We show that with the RBP the seller can learn the customer type exponentially fast even if the customer is strategic, and that the regret is bounded by a constant. We also propose policies that achieve asymptotically optimal regret when only a finite number of price changes is allowed.
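To make the belief-updating step concrete, the following is a minimal sketch of a Bayesian update over a binary prior, together with a randomized variant that applies the update only with some probability. It is an illustration under stated assumptions, not the paper's policy: the function names, the two example purchase-probability curves, and the parameter `update_prob` are all hypothetical, and the customer here responds non-strategically, so the sketch does not reproduce the strategic behavior the paper analyzes.

```python
import random


def bayes_update(belief_high, price, purchased, buy_prob_high, buy_prob_low):
    """Return the posterior probability that the customer is the 'high' type.

    belief_high   -- prior probability of the 'high' type (binary prior)
    buy_prob_high -- hypothetical function: P(purchase | price, high type)
    buy_prob_low  -- hypothetical function: P(purchase | price, low type)
    """
    p_high = buy_prob_high(price)
    p_low = buy_prob_low(price)
    # Likelihood of the observed outcome under each candidate type.
    like_high = p_high if purchased else 1.0 - p_high
    like_low = p_low if purchased else 1.0 - p_low
    evidence = belief_high * like_high + (1.0 - belief_high) * like_low
    if evidence == 0.0:
        # Outcome impossible under both types: leave the belief unchanged.
        return belief_high
    return belief_high * like_high / evidence


def randomized_update(belief_high, price, purchased,
                      buy_prob_high, buy_prob_low, update_prob, rng):
    """Apply the Bayesian update only with probability update_prob.

    Assumption: this mimics the 'update with a certain probability' idea in
    spirit only; how the paper chooses this probability is not modeled here.
    """
    if rng.random() < update_prob:
        return bayes_update(belief_high, price, purchased,
                            buy_prob_high, buy_prob_low)
    return belief_high


# Hypothetical example: valuations uniform on [0, 2] (high) or [0, 1] (low),
# so the purchase probability at price p is P(valuation >= p).
def high_type(p):
    return min(1.0, max(0.0, 1.0 - p / 2.0))


def low_type(p):
    return min(1.0, max(0.0, 1.0 - p))


if __name__ == "__main__":
    rng = random.Random(0)
    belief = 0.5
    for _ in range(20):
        bought = rng.random() < high_type(0.5)  # true type is 'high'
        belief = randomized_update(belief, 0.5, bought,
                                   high_type, low_type, 0.5, rng)
    print(belief)
```

In this toy setup, a purchase at price 0.5 moves a prior of 0.5 toward the high type, and a non-purchase moves it toward the low type. Setting `update_prob` to 1 recovers an update in every period.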
Keywords: revenue management; Bayesian learning; strategic customers; pricing
JEL Classification: D4
Suggested Citation:
Chen, Xi and Wang, Zizhuo, Bayesian Dynamic Learning and Pricing with Strategic Customers (January 14, 2016). Available at SSRN: https://ssrn.com/abstract=2715730 or http://dx.doi.org/10.2139/ssrn.2715730