Dynamic Pricing With Infrequent Inventory Replenishments
33 Pages Posted: 14 Oct 2022
Date Written: October 6, 2022
We consider a joint pricing and inventory control problem in which prices can be adjusted more frequently (for example, every period) than inventory ordering decisions, which are made once per epoch, where an epoch consists of multiple periods. The setting is motivated by many practical examples, especially among online retailers, where prices are much easier to change than inventory levels because the latter are subject to logistical and capacity constraints. The retailer determines the inventory level at the beginning of each epoch and then solves a dynamic pricing problem within the epoch with no further replenishment opportunities. The optimal pricing and inventory control policy is characterized by an intricate dynamic programming (DP) solution. We consider the situation where both the demand-price function and the distribution of the random demand noise are unknown to the retailer, who must develop an online learning algorithm that learns this information while maximizing total profit. We propose a learning algorithm based on least squares estimation and the construction of an empirical noise distribution under an upper confidence bound (UCB) framework, and we prove that the algorithm converges, through the DP recursions, to the optimal pricing and inventory control policy under complete demand information. We prove a theoretical lower bound on the convergence rate of any learning algorithm using the multivariate Van Trees inequality coupled with structural DP analyses, and we show that the upper bound on our algorithm's convergence rate matches this lower bound.
Keywords: Dynamic pricing, infrequent inventory replenishment, online demand learning, convergence analyses
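The structure described in the abstract (estimate demand by least squares, treat the residuals as an empirical noise distribution, then solve a within-epoch pricing DP with no replenishment) can be illustrated with a minimal sketch. It is not the paper's algorithm: the linear demand form `a - b * price + noise`, the finite price grid, integer demand, lost sales, zero salvage value, the linear ordering cost, and all function names are assumptions made for illustration, and the UCB exploration step is omitted entirely.

```python
import numpy as np

def fit_linear_demand(prices, demands):
    """Least-squares fit of demand = a - b * price + noise (assumed linear form).

    Returns (a, b, residuals); the residuals act as an empirical noise sample.
    """
    prices = np.asarray(prices, dtype=float)
    demands = np.asarray(demands, dtype=float)
    X = np.column_stack([np.ones_like(prices), -prices])
    coef, *_ = np.linalg.lstsq(X, demands, rcond=None)
    residuals = demands - X @ coef
    return coef[0], coef[1], residuals

def epoch_value(a, b, noise, price_grid, periods, max_inventory):
    """Backward induction over one epoch with no replenishment.

    V[x] is the optimal expected revenue over the epoch when starting
    with x units on hand. Unsold stock is assumed worthless at the end.
    """
    noise = np.asarray(noise, dtype=float)
    levels = np.arange(max_inventory + 1)
    V = np.zeros(max_inventory + 1)  # terminal value (zero salvage, assumed)
    for _ in range(periods):
        best = np.full(max_inventory + 1, -np.inf)
        for p in price_grid:
            # Integer demand for each empirical noise sample.
            d = np.maximum(0, np.rint(a - b * p + noise)).astype(int)
            # Sales for every (inventory level, noise sample) pair.
            sales = np.minimum(d[None, :], levels[:, None])
            # Expected revenue now plus value-to-go from the remaining stock.
            value = (p * sales + V[levels[:, None] - sales]).mean(axis=1)
            best = np.maximum(best, value)
        V = best
    return V

def best_order_up_to(V, unit_cost):
    """Inventory level maximizing epoch revenue minus a linear ordering cost."""
    return int(np.argmax(V - unit_cost * np.arange(len(V))))
```

In this sketch, one would call `fit_linear_demand` on the price and demand observations collected so far, pass the estimates and residuals to `epoch_value`, and pick the epoch's starting inventory with `best_order_up_to`. A learning algorithm of the kind the abstract describes would additionally widen the estimates by confidence bounds before solving the DP.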