Phase Transitions in Bandits with Switching Constraints

Management Science (to appear)

111 Pages. Posted: 6 Jun 2019. Last revised: 26 Aug 2022

David Simchi-Levi

Massachusetts Institute of Technology (MIT) - School of Engineering

Yunzong Xu

Massachusetts Institute of Technology (MIT)

Date Written: April 30, 2019

Abstract

We consider the classical stochastic multi-armed bandit problem with a constraint that limits the total cost incurred by switching between actions to be no larger than a given switching budget. For this problem, we prove matching upper and lower bounds on the optimal (i.e., minimax) regret, and provide efficient rate-optimal algorithms. Surprisingly, the optimal regret of this problem exhibits a non-conventional growth rate in terms of the time horizon and the number of arms. Consequently, we discover "phase transitions" in how the optimal regret rate changes with respect to the switching budget: when the number of arms is fixed, there are equal-length phases, where the optimal regret rate remains (almost) the same within each phase and exhibits abrupt changes between phases; when the number of arms grows with the time horizon, such abrupt changes become subtler and may disappear, but a generalized notion of phase transitions involving certain new measurements still exists. The results enable us to fully characterize the trade-off between the regret rate and the incurred switching cost in the stochastic multi-armed bandit problem, contributing new insights to this fundamental problem. Under the general switching cost structure, the results reveal interesting connections between bandit problems and graph traversal problems, such as the shortest Hamiltonian path problem.
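
To make the problem setting concrete, the following is a minimal simulation sketch, not the paper's rate-optimal algorithm. It assumes unit switching costs and Bernoulli rewards, and it uses a textbook explore-then-commit policy that needs a budget of at least K switches. The function name `run_etc`, its parameters, and the block-length rule are illustrative assumptions that do not come from the paper.

```python
import random


def run_etc(means, horizon, budget, seed=0):
    """Explore-then-commit under a switching budget (illustrative sketch).

    Assumptions: Bernoulli rewards with the given means, and a cost of one
    per change of arm. Each arm is pulled in a single contiguous block, so
    exploration costs K - 1 switches and committing costs at most one more.
    Returns (pseudo_regret, switches_used).
    """
    rng = random.Random(seed)
    k = len(means)
    if budget < k or horizon < k:
        raise ValueError("sketch assumes budget >= K and horizon >= K")

    # Textbook block length of order T^(2/3) / K, clipped to fit the horizon.
    block = max(1, min(horizon // k, round(horizon ** (2 / 3)) // k))
    best_mean = max(means)
    totals = [0.0] * k
    state = {"switches": 0, "regret": 0.0, "prev": None, "t": 0}

    def pull(arm):
        # Count a switch whenever the arm differs from the previous one.
        if state["prev"] is not None and arm != state["prev"]:
            state["switches"] += 1
        state["prev"] = arm
        state["t"] += 1
        state["regret"] += best_mean - means[arm]
        return 1.0 if rng.random() < means[arm] else 0.0

    # Exploration: one contiguous block per arm.
    for arm in range(k):
        for _ in range(block):
            totals[arm] += pull(arm)

    # Commit: every arm has the same pull count, so comparing sums suffices.
    leader = max(range(k), key=lambda a: totals[a])
    while state["t"] < horizon:
        pull(leader)

    return state["regret"], state["switches"]


if __name__ == "__main__":
    regret, switches = run_etc([0.2, 0.5, 0.8], horizon=3000, budget=3)
    print(f"pseudo-regret={regret:.1f}, switches={switches}")
```

The sketch only shows how the budget restricts a policy: with fewer than K allowed switches this simple policy cannot even try every arm and still move to its chosen one, which is the kind of trade-off between regret and switching cost that the abstract describes.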

Keywords: multi-armed bandit, switching constraint, limited switches, phase transition, regret analysis, minimax lower bound, graph traversal, revenue management, dynamic pricing

Suggested Citation

Simchi-Levi, David and Xu, Yunzong, Phase Transitions in Bandits with Switching Constraints (April 30, 2019). Management Science (to appear), Available at SSRN: https://ssrn.com/abstract=3380783 or http://dx.doi.org/10.2139/ssrn.3380783

David Simchi-Levi

Massachusetts Institute of Technology (MIT) - School of Engineering

MA
United States

Yunzong Xu (Contact Author)

Massachusetts Institute of Technology (MIT)

77 Massachusetts Avenue
50 Memorial Drive
Cambridge, MA 02139-4307
United States
