Non-Stationary Bandits with Auto-Regressive Temporal Dependency

Thirty-seventh Conference on Neural Information Processing Systems

45 Pages. Posted: 9 Aug 2021. Last revised: 12 Dec 2023.

Qinyi Chen

Massachusetts Institute of Technology (MIT) - Operations Research Center

Negin Golrezaei

Massachusetts Institute of Technology (MIT) - Sloan School of Management

Djallel Bouneffouf

IBM Research

Date Written: June 4, 2021

Abstract

Traditional multi-armed bandit (MAB) frameworks, predominantly examined under stochastic or adversarial settings, often overlook the temporal dynamics inherent in many real-world applications such as recommendation systems and online advertising. This paper introduces a novel non-stationary MAB framework that captures the temporal structure of these real-world dynamics through an auto-regressive (AR) reward structure. We propose an algorithm that integrates two key mechanisms: (i) an alternation mechanism adept at leveraging temporal dependencies to dynamically balance exploration and exploitation, and (ii) a restarting mechanism designed to discard out-of-date information. Our algorithm achieves a regret upper bound that nearly matches the lower bound, with regret measured against a robust dynamic benchmark. Finally, via a real-world case study on tourism demand prediction, we demonstrate both the efficacy of our algorithm and the broader applicability of our techniques to more complex, rapidly evolving time series.
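
To make the setting concrete, here is a minimal sketch. It is not the paper's algorithm or its exact model. It assumes each arm's reward follows a simple AR(1) recursion, r_i(t+1) = alpha * r_i(t) + noise, with Gaussian noise. The function names, the parameters `alpha` and `sigma`, and the noise distribution are all assumptions introduced for illustration. The sketch simulates such rewards and compares a dynamic benchmark (the best arm in every round) with the best single fixed arm.

```python
import random


def simulate_ar1_rewards(num_arms, horizon, alpha, sigma, seed=0):
    """Simulate rewards[t][i] under an assumed AR(1) recursion.

    Assumption (not taken from the paper): r_i(t+1) = alpha * r_i(t) + eps,
    where eps is Gaussian with mean 0 and standard deviation sigma.
    """
    rng = random.Random(seed)
    current = [rng.gauss(0.0, sigma) for _ in range(num_arms)]
    rewards = [current]
    for _ in range(horizon - 1):
        # Each arm evolves independently from its own previous reward.
        current = [alpha * r + rng.gauss(0.0, sigma) for r in current]
        rewards.append(current)
    return rewards


def dynamic_benchmark(rewards):
    """Total reward of an oracle that picks the best arm in every round."""
    return sum(max(row) for row in rewards)


def best_fixed_arm(rewards):
    """Total reward of the single arm that is best in hindsight."""
    num_arms = len(rewards[0])
    return max(sum(row[i] for row in rewards) for i in range(num_arms))


if __name__ == "__main__":
    rewards = simulate_ar1_rewards(num_arms=3, horizon=200, alpha=0.9, sigma=0.1, seed=1)
    print("dynamic benchmark:", dynamic_benchmark(rewards))
    print("best fixed arm:   ", best_fixed_arm(rewards))
```

Because the best arm can change from round to round as rewards drift, the dynamic benchmark is never smaller than the best fixed arm's total. That gap is one way to see why regret measured against a dynamic benchmark is the more demanding criterion.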

Keywords: non-stationary bandits, autoregressive model, low-regret policy, online learning algorithms

Suggested Citation

Chen, Qinyi and Golrezaei, Negin and Bouneffouf, Djallel, Non-Stationary Bandits with Auto-Regressive Temporal Dependency (June 4, 2021). Thirty-seventh Conference on Neural Information Processing Systems, Available at SSRN: https://ssrn.com/abstract=3887608 or http://dx.doi.org/10.2139/ssrn.3887608

Qinyi Chen (Contact Author)

Massachusetts Institute of Technology (MIT) - Operations Research Center

77 Massachusetts Avenue
Bldg. E 40-149
Cambridge, MA 02139
United States

Negin Golrezaei

Massachusetts Institute of Technology (MIT) - Sloan School of Management

100 Main Street
E62-416
Cambridge, MA 02142
United States

Djallel Bouneffouf

IBM Research

T. J. Watson Research Center
1 New Orchard Road
Armonk, NY 10504-1722
United States
