Adaptive Learning in Finitely Repeated Games
41 Pages. Posted: 20 Apr 2021
Date Written: April 20, 2021
This paper investigates how adaptive players behave in the long run in finitely repeated games. Each player assigns subjective payoff assessments to his own actions and, at each of his information sets, chooses the action with the highest assessment. After receiving payoffs, players adaptively update their assessments of the chosen actions using the realized payoffs; we consider the updating rules of Watkins and Dayan (1992) and Sarin and Vahid (1999). When players experience random shocks to their assessments, their behavior strategies converge to a unique agent quantal response equilibrium (McKelvey and Palfrey, 1998) if the finitely repeated game has perfect information. For more general cases, we provide an additional condition that guarantees convergence. When players do not experience the shocks, (1) in Selten’s finitely repeated chain store game, at each round either (i) the local store owner decides not to enter or (ii) he decides to enter and the chain store manager decides to accommodate; (2) in the finitely repeated prisoner’s dilemma, both players may end up cooperating at each round; and (3) in a finitely repeated coordination game, both players end up coordinating at each round.
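The assessment-and-update loop described above can be illustrated with a minimal sketch. It is not the paper's model: it assumes a single information set per player (so it ignores the history dependence of a finitely repeated game), no random shocks, and an averaging update in the spirit of Sarin and Vahid (1999), in which only the chosen action's assessment moves toward the realized payoff. The function names, the step size `step`, the payoff matrix, and the initial assessments are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

Assessments = Dict[str, float]


def choose_action(assessments: Assessments) -> str:
    """Pick the action with the highest assessment (first one wins on ties)."""
    return max(assessments, key=assessments.get)


def update_assessment(
    assessments: Assessments, action: str, payoff: float, step: float = 0.5
) -> Assessments:
    """Move only the chosen action's assessment toward the realized payoff.

    Assumed rule: new = (1 - step) * old + step * payoff.
    Assessments of unchosen actions are left unchanged.
    """
    updated = dict(assessments)
    updated[action] = (1 - step) * assessments[action] + step * payoff
    return updated


def play_coordination(
    rounds: int = 20, step: float = 0.5
) -> Tuple[List[Tuple[str, str]], Assessments, Assessments]:
    """Simulate two greedy learners in a hypothetical 2x2 coordination game."""
    # Hypothetical stage-game payoffs: matching pays, mismatching pays zero.
    payoffs = {
        ("A", "A"): (2.0, 2.0),
        ("B", "B"): (1.0, 1.0),
        ("A", "B"): (0.0, 0.0),
        ("B", "A"): (0.0, 0.0),
    }
    # Hypothetical initial assessments; outcomes depend on these values.
    p1: Assessments = {"A": 1.5, "B": 1.0}
    p2: Assessments = {"A": 0.5, "B": 1.2}
    history: List[Tuple[str, str]] = []
    for _ in range(rounds):
        a1, a2 = choose_action(p1), choose_action(p2)
        u1, u2 = payoffs[(a1, a2)]
        p1 = update_assessment(p1, a1, u1, step)
        p2 = update_assessment(p2, a2, u2, step)
        history.append((a1, a2))
    return history, p1, p2


if __name__ == "__main__":
    history, p1, p2 = play_coordination()
    print(history[:3], p1, p2)
```

With these particular initial values the players miscoordinate once and then settle on the same action in every later round. Which action they settle on, and whether they settle at all in this simplified setting, depends on the assumed initial assessments, step size, and tie-breaking rule.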
Keywords: Adaptive learning in games, Q-Learning, Finitely repeated games, Coordination games, Finitely repeated prisoner’s dilemma, Agent quantal response equilibrium, Self-confirming equilibrium, Asynchronous stochastic approximation
JEL Classification: D83, C72, C73, D81