UCB-Type Learning Algorithms with Kaplan-Meier Estimator for Lost-Sales Inventory Models with Lead Times

Forthcoming in Operations Research

75 Pages. Posted: 18 Oct 2021. Last revised: 13 Feb 2024

Chengyi Lyu

University of Colorado at Boulder - Leeds School of Business

Huanan Zhang

University of Colorado at Boulder - Leeds School of Business

Linwei Xin

University of Chicago - Booth School of Business

Date Written: April 23, 2023

Abstract

In this paper, we consider a classic periodic-review lost-sales inventory system with lead times, which has a wide range of real-world applications and is notoriously challenging to optimize. We consider a joint learning and optimization problem in which the decision-maker does not know the demand distribution a priori and can only use past sales information (i.e., censored demand). Departing from existing learning algorithms for this problem (e.g., Huh et al. 2009a, Agrawal and Jia 2019, Zhang et al. 2020), which require convexity of the underlying system, we develop an Upper Confidence Bound (UCB)-type learning framework that incorporates simulations with the Kaplan-Meier estimator. We demonstrate its applicability to learning not only the optimal capped base-stock policy, for which convexity no longer holds, but also the optimal base-stock policy, with a regret that matches the best existing result. Compared with a classic multi-armed bandit problem, our problem poses unique challenges that stem from the nature of the inventory system: (1) each action has long-term impacts on future costs, and (2) the system state space is exponentially large in the lead time. As such, our learning algorithms are not naive adaptations of the classic UCB algorithm; in fact, the design of the simulation steps with the Kaplan-Meier estimator and of the averaging steps is novel, and the confidence width in the UCB index also differs from the classic one. We prove that the regret bounds of our learning algorithms are tight, up to a logarithmic term, in the planning horizon T. Our extensive numerical experiments suggest that the proposed algorithms (almost) dominate existing learning algorithms. We also demonstrate how to select which learning algorithm to use when demand data are limited.

Keywords: inventory, lost sales, lead time, censored demand, learning algorithm, capped base-stock policy, Kaplan-Meier estimator

Suggested Citation

Lyu, Chengyi and Zhang, Huanan and Xin, Linwei, UCB-Type Learning Algorithms with Kaplan-Meier Estimator for Lost-Sales Inventory Models with Lead Times (April 23, 2023). Forthcoming in Operations Research, Available at SSRN: https://ssrn.com/abstract=3944354 or http://dx.doi.org/10.2139/ssrn.3944354

Chengyi Lyu

University of Colorado at Boulder - Leeds School of Business

CO 80309
United States

Huanan Zhang

University of Colorado at Boulder - Leeds School of Business

Boulder, CO 80309-0419
United States

Linwei Xin (Contact Author)

University of Chicago - Booth School of Business

5807 S. Woodlawn Avenue
Chicago, IL 60637
United States
