Regret Minimization with Dynamic Benchmarks in Repeated Games

51 Pages. Posted: 20 Dec 2022. Last revised: 2 Jan 2023

Ludovico Crippa

Stanford Graduate School of Business

Yonatan Gur

Stanford Graduate School of Business

Bar Light

Microsoft Research, NYC

Date Written: December 6, 2022

Abstract

In repeated games, strategies are often evaluated by their ability to guarantee the performance of the single best action that is selected in hindsight (a property referred to as Hannan consistency, or no-regret). However, the effectiveness of the single best action as a yardstick to evaluate strategies is limited, as any static action may perform poorly in common dynamic settings. We propose the notion of dynamic benchmark consistency, which requires a strategy to asymptotically guarantee the performance of the best dynamic sequence of actions selected in hindsight subject to a constraint on the number of action changes the corresponding dynamic benchmark admits. We show that dynamic benchmark consistent strategies exist if and only if the number of changes in the benchmark scales sublinearly with the horizon length. Further, our main result establishes that the set of empirical joint distributions of play that may emerge, when all players deploy such strategies, asymptotically coincides with the set of Hannan equilibria (also referred to as coarse correlated equilibria) of the stage game. This general characterization allows one to leverage analyses developed for frameworks that consider static benchmarks, which we demonstrate by bounding the social efficiency of the possible outcomes in our setting. Together, our results imply that dynamic benchmark consistent strategies introduce the following Pareto-type improvement over no-regret strategies: They enable stronger individual guarantees against arbitrary strategies of the other players, while maintaining the same worst-case guarantees on the social welfare, when all players adopt these strategies.
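To make the contrast between the two benchmarks concrete, here is a minimal sketch of how each could be computed in hindsight for one player. It is an illustration only, not the paper's construction: the function names, the `payoffs[t][a]` layout (payoff of action `a` in period `t`, given what the other players actually did), and the dynamic-programming approach are all assumptions made for this example.

```python
from typing import Sequence


def static_benchmark(payoffs: Sequence[Sequence[float]]) -> float:
    """Total payoff of the single best fixed action chosen in hindsight."""
    n_actions = len(payoffs[0])
    return max(sum(row[a] for row in payoffs) for a in range(n_actions))


def dynamic_benchmark(payoffs: Sequence[Sequence[float]], max_changes: int) -> float:
    """Total payoff of the best action sequence with at most `max_changes` switches."""
    n_actions = len(payoffs[0])
    neg_inf = float("-inf")

    # best[c][a]: best total so far using exactly c switches and currently playing a.
    best = [[neg_inf] * n_actions for _ in range(max_changes + 1)]
    best[0] = list(payoffs[0])

    for row in payoffs[1:]:
        new = [[neg_inf] * n_actions for _ in range(max_changes + 1)]
        for c in range(max_changes + 1):
            for a in range(n_actions):
                stay = best[c][a]
                switch = neg_inf
                if c > 0:
                    # Arrive at a from a different action, spending one switch.
                    switch = max(
                        (best[c - 1][b] for b in range(n_actions) if b != a),
                        default=neg_inf,
                    )
                prev = max(stay, switch)
                if prev > neg_inf:
                    new[c][a] = prev + row[a]
        best = new

    return max(max(per_change) for per_change in best)


# Hypothetical data: action 0 pays off in the first three periods, action 1 in the last three.
payoffs = [[1, 0]] * 3 + [[0, 1]] * 3
print(static_benchmark(payoffs))      # best fixed action
print(dynamic_benchmark(payoffs, 1))  # best sequence with at most one switch
```

On this toy data every fixed action earns 3 in total, while a sequence that switches once earns 6, so a strategy with no regret against the static benchmark can still fall well short of the dynamic one. With `max_changes = 0` the two benchmarks coincide.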

Keywords: Repeated Games, Incomplete Information, No Regret, Price of Anarchy

Suggested Citation

Crippa, Ludovico and Gur, Yonatan and Light, Bar, Regret Minimization with Dynamic Benchmarks in Repeated Games (December 6, 2022). Available at SSRN: https://ssrn.com/abstract=4295141 or http://dx.doi.org/10.2139/ssrn.4295141

Ludovico Crippa

Stanford Graduate School of Business

655 Knight Way
Stanford, CA 94305
United States

Yonatan Gur (Contact Author)

Stanford Graduate School of Business

655 Knight Way
Stanford, CA 94305-5015
United States

Bar Light

Microsoft Research, NYC

New York, NY
United States
