Nonparametric Learning Algorithms for Joint Pricing and Inventory Control with Lost-Sales and Censored Demand
44 Pages; Posted: 10 Sep 2016; Last revised: 23 Mar 2020
Date Written: October 1, 2017
We consider a joint pricing and inventory control problem in which neither the customer’s response to the selling price nor the demand distribution is known a priori. Unsatisfied demand is lost and unobserved, and the only information available for decision-making is the observed sales data (a.k.a. censored demand). Conventional approaches, such as stochastic approximation, online convex optimization, and continuum-armed bandit algorithms, cannot be applied because neither the realized values of the profit function nor its derivatives are known. A major challenge of this problem is that the estimated profit function constructed from observed sales data is multimodal in price. We develop a nonparametric learning algorithm based on spline approximation. The algorithm separates the planning horizon into disjoint exploration and exploitation phases. During the exploration phase, the price space is discretized, and each price is offered for an equal number of periods together with a pre-specified target inventory level. Based on the sales data collected at these prices, a spline approximation of the demand-price function is constructed, and the corresponding surrogate optimization problem is then solved on a sparse grid to obtain a recommended pair of price and target inventory level. During the exploitation phase, the algorithm implements the recommended price and target inventory level. We establish a (nearly) square-root regret rate, which (almost) matches the theoretical lower bound.
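The explore-then-exploit structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a hypothetical linear demand model with uniform noise, uses a piecewise-linear (degree-1) spline in place of whatever spline family the paper employs, ignores the bias that censoring introduces into the sales averages, and assumes a profit of revenue minus a per-unit holding cost. All function names, grids, and parameter values are illustrative.

```python
import random

def linear_spline(xs, ys, x):
    """Evaluate the piecewise-linear (degree-1 spline) interpolant at x."""
    if x <= xs[0]:
        return ys[0]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])
    return ys[-1]

def explore_then_exploit(sample_demand, prices, periods_per_price, y_explore,
                         fine_prices, inv_grid, holding_cost):
    # Exploration: offer each grid price for the same number of periods at a
    # fixed target inventory; only sales = min(demand, inventory) are observed.
    mean_sales, residuals = [], []
    for p in prices:
        sales = [min(sample_demand(p), y_explore)
                 for _ in range(periods_per_price)]
        m = sum(sales) / len(sales)
        mean_sales.append(m)
        residuals.extend(s - m for s in sales)  # pooled empirical noise

    # Surrogate problem: maximize the sample-average profit over a grid of
    # (price, inventory) pairs, with the spline standing in for mean demand.
    best = None
    for p in fine_prices:
        d_hat = linear_spline(prices, mean_sales, p)
        for y in inv_grid:
            total = 0.0
            for e in residuals:
                d = max(d_hat + e, 0.0)
                total += p * min(d, y) - holding_cost * max(y - d, 0.0)
            avg = total / len(residuals)
            if best is None or avg > best[0]:
                best = (avg, p, y)
    # Exploitation would apply this pair for the rest of the horizon.
    return best[1], best[2]

# Hypothetical ground truth, unknown to the learner: linear demand plus noise.
rng = random.Random(0)

def sample_demand(p):
    return max(100.0 - 5.0 * p + rng.uniform(-5.0, 5.0), 0.0)

prices = [2.0 + 2.0 * k for k in range(9)]        # exploration grid: 2, 4, ..., 18
fine_prices = [2.0 + 0.5 * k for k in range(33)]  # search grid: 2, 2.5, ..., 18
inv_grid = [2.5 * k for k in range(41)]           # inventory levels: 0, 2.5, ..., 100

p_star, y_star = explore_then_exploit(
    sample_demand, prices, periods_per_price=50, y_explore=80.0,
    fine_prices=fine_prices, inv_grid=inv_grid, holding_cost=1.0)
print("recommended price:", p_star, "target inventory:", y_star)
```

With the exploration inventory set to 80, sales at the lowest prices are censored, so the fitted curve understates demand there. Correcting for that censoring is part of what the paper's analysis addresses and is deliberately left out of this sketch.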
Keywords: algorithm, joint pricing and inventory control, lost-sales, censored demand, nonparametric