Last-iterate Convergence in No-regret Learning: Games with Reference Effects Under Logit Demand
88 Pages. Posted: 7 Nov 2023. Last revised: 11 Mar 2025
Date Written: October 10, 2023
Abstract
This work studies algorithm design in oligopoly price competition, with the primary goal of examining long-run market behavior. We consider a realistic setting in which n firms engage in multi-period price competition under partial information (each firm can access only its own first-order feedback and has no information about its competitors) and under the influence of reference effects. Consumers assess their willingness to pay by comparing the current price against a memory-based reference price, and their choices follow the multinomial logit (MNL) model. We use the notion of stationary Nash equilibrium (SNE), defined as the fixed point of the equilibrium pricing policy, to capture long-run equilibrium and stability simultaneously. With loss-neutral reference effects, we propose the online projected gradient ascent (OPGA) algorithm, in which each firm adjusts its price using the first-order derivative of its log-revenue, accessible through the market feedback mechanism. Despite the absence of properties typically required for convergence in online games, such as strong monotonicity and variational stability, we show that under diminishing step-sizes, the price and reference price paths generated by OPGA attain last-iterate convergence to the unique SNE, thereby guaranteeing no-regret learning and market stability. Moreover, with appropriate step-sizes, we prove that the algorithm exhibits a convergence rate of O(1/t^2) and achieves constant dynamic regret. The inherently asymmetric nature of reference effects motivates exploration beyond loss-neutrality. When loss-averse reference effects are introduced, we propose the conservative-OPGA (C-OPGA) algorithm to handle the non-smooth revenue functions and show that the price and reference price achieve last-iterate convergence to the set of SNEs at a rate of O(1/\sqrt{t}).
Finally, we demonstrate the practicality and robustness of OPGA and C-OPGA by theoretically showing that these algorithms can also adapt to firm-differentiated step-sizes and inexact gradients.
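To make the OPGA update described in the abstract concrete, the following is a minimal sketch under stated assumptions, not the paper's implementation. The abstract does not give the model's functional forms, so the sketch assumes a linear utility a_i - b_i p_i + c_i (r_i - p_i) with a single loss-neutral sensitivity c_i, an outside option with utility 0, an exponential-smoothing reference price update with memory parameter alpha, a step-size of 1/t, and price bounds [p_lo, p_hi]. All function names, parameter names, and numerical values are hypothetical.

```python
import math


def mnl_demand(prices, refs, a, b, c):
    """MNL market shares with an outside option of utility 0.

    Assumed utility (not from the abstract):
    u_i = a_i - b_i * p_i + c_i * (r_i - p_i).
    """
    utils = [a[i] - b[i] * prices[i] + c[i] * (refs[i] - prices[i])
             for i in range(len(prices))]
    weights = [math.exp(u) for u in utils]
    total = 1.0 + sum(weights)
    return [w / total for w in weights]


def log_revenue_gradient(price, demand, b_i, c_i):
    """Derivative of log(p_i * d_i) with respect to p_i.

    Under the assumed utility this needs only the firm's own price and
    its own realized demand, i.e. first-order feedback.
    """
    return 1.0 / price - (b_i + c_i) * (1.0 - demand)


def opga(p0, r0, a, b, c, alpha=0.9, p_lo=0.1, p_hi=20.0, steps=20000):
    """Projected gradient ascent on log-revenue, run by every firm at once."""
    prices, refs = list(p0), list(r0)
    n = len(prices)
    for t in range(1, steps + 1):
        demand = mnl_demand(prices, refs, a, b, c)
        eta = 1.0 / t  # assumed diminishing step-size schedule
        new_prices = []
        for i in range(n):
            g = log_revenue_gradient(prices[i], demand[i], b[i], c[i])
            # Gradient step followed by projection onto [p_lo, p_hi].
            new_prices.append(min(p_hi, max(p_lo, prices[i] + eta * g)))
        # Assumed memory-based reference price: exponential smoothing
        # of past reference price and the price just charged.
        refs = [alpha * refs[i] + (1.0 - alpha) * prices[i] for i in range(n)]
        prices = new_prices
    return prices, refs


if __name__ == "__main__":
    # Hypothetical two-firm market.
    a, b, c = [2.0, 1.5], [1.0, 1.0], [0.5, 0.5]
    p, r = opga([2.0, 2.0], [2.0, 2.0], a, b, c)
    print("final prices:", p)
    print("final reference prices:", r)
```

In this sketch each firm's update uses only its own price and demand, mirroring the partial-information setting. At a stationary point the reference price coincides with the price and each firm's log-revenue gradient vanishes, which is the fixed-point property the SNE notion is meant to capture.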
Keywords: last-iterate convergence, price competition, reference effect, multinomial logit model