An Efficient Learning Framework for Multi-Product Inventory Systems with Customer Choices

47 pages. Posted: 18 Feb 2021. Last revised: 17 Mar 2023

Xiangyu Gao

The Chinese University of Hong Kong (CUHK) - Department of Decision Sciences & Managerial Economics

Huanan Zhang

University of Colorado at Boulder - Leeds School of Business

Date Written: January 29, 2021

Abstract

In this paper, we first introduce a periodic-review multi-product inventory system in which each customer's demand is affected by product availability and the customer's preferences. Because customer preferences are not directly observable and are hard to estimate, when the full distributional information of the demand is not available, the decision-maker has to learn this information on the fly through partial and censored customer feedback. For this learning problem, if one ignores the inventory dynamics, simply treats it as a Multi-Armed Bandit problem, and directly applies an existing algorithm such as the Upper Confidence Bound (UCB) algorithm, convergence can be extremely slow due to the high dimensionality of the policy space. We propose a UCB-based learning framework that utilizes the demand information based on two improvement ideas. We illustrate how these two ideas can be incorporated by considering two specific systems: 1) a multi-product inventory system with stock-out substitutions, and 2) a multi-product inventory assortment problem for urban warehouses. We develop improved UCB algorithms for both systems using the two improvements. For both systems, the algorithm achieves a tight worst-case convergence rate (up to a logarithmic term) in the planning horizon T. Extensive numerical experiments demonstrate the efficiency of the improved UCB algorithms for the two systems. In the experiments, when there are more than 1000 candidate policies to choose from, the algorithms achieve around 15% average expected regret within 50 periods and continue to improve steadily as time increases.
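
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the textbook UCB1 rule, in which each candidate policy is treated as an independent arm. It is not the authors' improved algorithm and does not model inventory dynamics or censored demand. The names `ucb1` and `bernoulli_arm`, and the Bernoulli reward model, are assumptions made purely for illustration.

```python
import math
import random


def bernoulli_arm(p):
    """Hypothetical reward model: pays 1 with probability p, else 0."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


def ucb1(arms, horizon, seed=0):
    """Textbook UCB1 over independent arms (illustrative baseline only).

    arms: list of callables taking a random.Random and returning a reward in [0, 1].
    Returns (pull counts, empirical mean reward) per arm.
    """
    rng = random.Random(seed)
    n_arms = len(arms)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialization: pull every arm once.
            arm = t - 1
        else:
            # Choose the arm with the largest upper confidence bound.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = arms[arm](rng)
        counts[arm] += 1
        # Incremental update of the empirical mean.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


if __name__ == "__main__":
    counts, means = ucb1([bernoulli_arm(0.1), bernoulli_arm(0.9)], horizon=2000)
    print(counts, means)
```

Because every arm must be pulled at least once before the confidence bounds become informative, the exploration cost of this baseline grows with the number of arms, which is consistent with the slow convergence the abstract describes when the policy space is large.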

Keywords: Inventory management, Online learning, Customer choices

Suggested Citation

Gao, Xiangyu and Zhang, Huanan, An Efficient Learning Framework for Multi-Product Inventory Systems with Customer Choices (January 29, 2021). Available at SSRN: https://ssrn.com/abstract=3775303 or http://dx.doi.org/10.2139/ssrn.3775303

Xiangyu Gao (Contact Author)

The Chinese University of Hong Kong (CUHK) - Department of Decision Sciences & Managerial Economics

Shatin, N.T.
Hong Kong

Huanan Zhang

University of Colorado at Boulder - Leeds School of Business

Boulder, CO 80309-0419
United States
