Blind Network Revenue Management and Bandits with Knapsacks under Limited Switches

88 Pages. Posted: 6 Dec 2019. Last revised: 2 Jan 2021

David Simchi-Levi

Massachusetts Institute of Technology (MIT) - School of Engineering

Yunzong Xu

Massachusetts Institute of Technology (MIT)

Jinglong Zhao

Boston University - Questrom School of Business

Date Written: November 1, 2019

Abstract

Our work is motivated by a common business constraint in online markets. While firms recognize the advantages of dynamic pricing and price experimentation, various practical reasons force them to keep the number of price changes (i.e., switches) within some budget. We study both the classical price-based network revenue management problem in the distributionally-unknown setup and the bandits with knapsacks problem. In these problems, a decision-maker without prior knowledge of the environment has a finite initial inventory of multiple resources to allocate over a finite time horizon. Beyond the classical resource constraints, we introduce an additional switching constraint, which restricts the total number of times the decision-maker switches between actions to be within a fixed switching budget. For such problems, we show matching upper and lower bounds on the optimal regret, and we propose computationally efficient limited-switch algorithms that achieve the optimal regret. Our work reveals a surprising result: the optimal regret rate is completely characterized by a piecewise-constant function of the switching budget, which further depends on the number of resource constraints. To the best of our knowledge, this is the first time the number of resource constraints has been shown to play a fundamental role in determining the statistical complexity of online learning problems. We conduct computational experiments to examine the performance of our algorithms on a numerical setup that is widely used in the literature. Compared with benchmark algorithms from the literature, our proposed algorithms achieve promising performance, with clear advantages in the number of incurred switches. Practically, firms that simultaneously face resource and switching constraints can use our results to improve their learning and decision-making performance.
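
As a small illustration of the switching constraint described in the abstract, the sketch below counts the switches in a sequence of actions (for example, posted prices) and checks the count against a budget. It is not taken from the paper: the function names, the example price path, and the budget values are all hypothetical, and it assumes a switch is counted whenever the action at one time step differs from the action at the previous step.

```python
from typing import Hashable, Sequence


def count_switches(actions: Sequence[Hashable]) -> int:
    """Count time steps where the action differs from the previous step.

    Assumption (not from the paper): the first action is free, and every
    later change of action costs exactly one switch.
    """
    return sum(1 for prev, cur in zip(actions, actions[1:]) if prev != cur)


def within_switching_budget(actions: Sequence[Hashable], budget: int) -> bool:
    """Return True if the action sequence uses at most `budget` switches."""
    return count_switches(actions) <= budget


# Hypothetical price path over seven periods.
prices = [10, 10, 12, 12, 12, 10, 15]

print(count_switches(prices))              # changes at 10->12, 12->10, 10->15
print(within_switching_budget(prices, 3))  # feasible under a budget of 3
print(within_switching_budget(prices, 2))  # infeasible under a budget of 2
```

A limited-switch algorithm in this setting would have to choose its whole action sequence so that such a check passes, while also respecting the inventory of each resource.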

Keywords: Network Revenue Management, Bandits With Knapsacks, Online Learning, Limited Switches

Suggested Citation

Simchi-Levi, David and Xu, Yunzong and Zhao, Jinglong, Blind Network Revenue Management and Bandits with Knapsacks under Limited Switches (November 1, 2019). Available at SSRN: https://ssrn.com/abstract=3479477 or http://dx.doi.org/10.2139/ssrn.3479477

David Simchi-Levi

Massachusetts Institute of Technology (MIT) - School of Engineering

MA
United States

Yunzong Xu (Contact Author)

Massachusetts Institute of Technology (MIT)

77 Massachusetts Avenue
50 Memorial Drive
Cambridge, MA 02139-4307
United States

Jinglong Zhao

Boston University - Questrom School of Business

595 Commonwealth Avenue
Boston, MA 02215
United States
