Dynamic Pricing and Inventory Control with Fixed Ordering Cost and Incomplete Demand Information

49 Pages. Posted: 24 Jun 2020. Last revised: 18 Jan 2021

Boxiao Chen

University of Illinois at Chicago - College of Business Administration

David Simchi-Levi

Massachusetts Institute of Technology (MIT) - School of Engineering

Yining Wang

University of Florida - Warrington College of Business Administration

Yuan Zhou

University of Illinois at Urbana-Champaign

Date Written: June 21, 2020

Abstract

We consider the periodic-review dynamic pricing and inventory control problem with fixed ordering cost. Demand is random and price dependent, and unsatisfied demand is backlogged. With complete demand information, the celebrated (s,S,p) policy has been proved optimal, where s and S are the reorder point and order-up-to level of the ordering strategy, and p, a function of the on-hand inventory level, characterizes the pricing strategy. In this paper, we consider incomplete demand information and develop online learning algorithms whose average profit approaches that of the optimal (s,S,p) policy with a tight Õ(√T) regret rate.
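To make the structure of an (s,S,p) policy concrete, the following is a minimal simulation sketch, not the paper's algorithm. It assumes that an order is placed whenever inventory falls strictly below s, that the price is a function of the post-order inventory level, that backlogged demand still earns revenue at the posted price, and that holding and backlog costs are linear. The function name and the cost parameters K, c, h, b are hypothetical names introduced for illustration.

```python
import random


def simulate_s_S_p(s, S, price_fn, demand_fn, T, K, c, h, b, x0=0.0, seed=0):
    """Average per-period profit of an (s, S, p) policy with backlogging.

    All cost parameters are illustrative assumptions:
    K = fixed ordering cost, c = unit ordering cost,
    h = unit holding cost, b = unit backlog cost.
    """
    rng = random.Random(seed)
    x = x0                      # inventory level (negative means backlog)
    total_profit = 0.0
    for _ in range(T):
        # Ordering rule: below the reorder point s, order up to S.
        if x < s:
            total_profit -= K + c * (S - x)
            y = S
        else:
            y = x
        # Pricing rule: the price depends on the inventory level.
        p = price_fn(y)
        d = demand_fn(p, rng)   # random, price-dependent demand
        total_profit += p * d   # assumed: backlogged demand is also paid for
        x = y - d               # unsatisfied demand is carried as backlog
        total_profit -= h * max(x, 0.0) + b * max(-x, 0.0)
    return total_profit / T


# Deterministic toy instance so the arithmetic can be checked by hand.
avg = simulate_s_S_p(
    s=1, S=10,
    price_fn=lambda y: 5.0,
    demand_fn=lambda p, rng: 10.0 - p,
    T=4, K=4.0, c=1.0, h=1.0, b=2.0,
)
print(avg)
```

In the toy instance demand is 5 units per period, so the policy orders every second period; replacing `demand_fn` with a noisy function of the price gives the stochastic setting the abstract describes.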

A number of salient features differentiate our work from the existing online learning research in the OM literature. First, computing the optimal (s,S,p) policy requires solving a dynamic program (DP) over multiple periods involving unknown quantities, which is different from the majority of learning problems in operations management that only require solving single-period optimization problems. It is hence challenging to establish stability results through DP recursions, which we accomplish by proving uniform convergence of the profit-to-go function. The necessity of analyzing action-dependent state transitions over multiple periods resembles reinforcement learning and is considerably more difficult than the settings addressed by existing bandit learning algorithms. Second, the pricing function p is of infinite dimension, and approximating it is much more challenging than approximating a finite number of parameters as seen in existing research. The demand-price relationship is estimated using an upper confidence bound approach, but the confidence interval cannot be explicitly calculated due to the complexity of the DP recursion. Finally, due to the multi-period nature of (s,S,p) policies, the actual distribution of the randomness in demand, which is unknown to the learner a priori, plays an important role in determining the optimal pricing strategy p. In this paper, the demand randomness is approximated by an empirical distribution constructed using dependent samples, and a novel Wasserstein-metric-based argument is employed to prove convergence of the empirical distribution.
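As a rough illustration of measuring how close an empirical distribution is to its target in Wasserstein distance, the sketch below computes the 1-Wasserstein distance between two equal-size one-dimensional samples, which equals the mean absolute difference of their sorted values. It is an assumption-laden toy: the samples are i.i.d. uniform draws rather than the dependent samples treated in the paper, and the evenly spaced reference grid is only a proxy for the true distribution.

```python
import random


def wasserstein1(xs, ys):
    """1-Wasserstein distance between two equal-size 1-D empirical samples."""
    if len(xs) != len(ys):
        raise ValueError("equal sample sizes required")
    # In one dimension the optimal coupling matches sorted values in order.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


def distance_to_uniform(n, rng):
    """Distance between n uniform(0, 1) draws and an evenly spaced grid."""
    sample = [rng.random() for _ in range(n)]
    # Midpoint grid: a discrete stand-in for the uniform(0, 1) distribution.
    reference = [(i + 0.5) / n for i in range(n)]
    return wasserstein1(sample, reference)


rng = random.Random(0)
for n in (10, 100, 1000, 10000):
    print(n, distance_to_uniform(n, rng))
```

Running the loop shows how the distance behaves as the sample size grows in this i.i.d. toy case; the paper's contribution concerns the harder case in which the samples are dependent.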

Keywords: dynamic pricing, inventory control, fixed ordering cost, online learning, asymptotic analysis

Suggested Citation

Chen, Boxiao and Simchi-Levi, David and Wang, Yining and Zhou, Yuan, Dynamic Pricing and Inventory Control with Fixed Ordering Cost and Incomplete Demand Information (June 21, 2020). Available at SSRN: https://ssrn.com/abstract=3632475 or http://dx.doi.org/10.2139/ssrn.3632475

Boxiao Chen (Contact Author)

University of Illinois at Chicago - College of Business Administration

601 S Morgan St
Chicago, IL 60607
United States

David Simchi-Levi

Massachusetts Institute of Technology (MIT) - School of Engineering

MA
United States

Yining Wang

University of Florida - Warrington College of Business Administration

Gainesville, FL 32611
United States

Yuan Zhou

University of Illinois at Urbana-Champaign

Transportation Building
University of Illinois at Urbana-Champaign
Urbana, IL 61801
United States
