Adaptive Learning in Finitely Repeated Games

41 Pages Posted: 20 Apr 2021

Date Written: April 20, 2021


This paper investigates how adaptive players behave in the long run in finitely repeated games. Each player assigns subjective payoff assessments to his own actions and, at each of his information sets, chooses the action with the highest assessment. After receiving payoffs, players adaptively update their assessments of the chosen actions using the realized payoffs; we consider the updating rules of Watkins and Dayan (1992) and Sarin and Vahid (1999). When players experience random shocks to their assessments, their behavior strategies converge to a unique agent quantal response equilibrium (McKelvey and Palfrey, 1998) if the finitely repeated game has perfect information. For more general cases, we provide an additional condition that guarantees convergence. When players do not experience the shocks, (1) in Selten’s finitely repeated chain store game, at each round either (i) a local store owner decides not to enter or (ii) he decides to enter and the chain store manager decides to accommodate; (2) in the finitely repeated prisoner’s dilemma, both players may end up cooperating at each round; and (3) in a finitely repeated coordination game, both players end up coordinating at each round.
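
As an illustration only, the assessment-based choice and updating described in the abstract can be sketched in Python. This is a simplified sketch under stated assumptions, not the paper's model: it uses a single stage game instead of a finitely repeated game with information sets, it applies only a weighted-average update of the chosen action's assessment (in the spirit of Sarin and Vahid, 1999, without the continuation-value term of Q-learning), and the payoff table, initial assessments, step size, and Gaussian shock distribution are all hypothetical choices.

```python
import random

# Hypothetical 2x2 coordination stage game: (row action, column action) -> payoffs.
PAYOFFS = {
    ("A", "A"): (2.0, 2.0),
    ("A", "B"): (0.0, 0.0),
    ("B", "A"): (0.0, 0.0),
    ("B", "B"): (1.0, 1.0),
}


def choose_action(assessments, shock_scale=0.0, rng=random):
    """Return the action with the highest assessment after adding a random shock.

    The Gaussian shock is an assumption made for this sketch; with
    shock_scale=0.0 the choice is purely greedy.
    """
    shocked = {
        action: value + shock_scale * rng.gauss(0.0, 1.0)
        for action, value in assessments.items()
    }
    return max(shocked, key=shocked.get)


def update_assessment(assessments, action, payoff, step=0.1):
    """Move only the chosen action's assessment toward the realized payoff."""
    updated = dict(assessments)
    updated[action] = (1.0 - step) * assessments[action] + step * payoff
    return updated


def simulate(rounds=200, step=0.1, shock_scale=0.0, seed=0):
    """Let two adaptive players repeatedly play the hypothetical stage game."""
    rng = random.Random(seed)
    q1 = {"A": 1.5, "B": 1.0}  # hypothetical initial assessments, player 1
    q2 = {"A": 1.5, "B": 1.0}  # hypothetical initial assessments, player 2
    for _ in range(rounds):
        a1 = choose_action(q1, shock_scale, rng)
        a2 = choose_action(q2, shock_scale, rng)
        p1, p2 = PAYOFFS[(a1, a2)]
        q1 = update_assessment(q1, a1, p1, step)
        q2 = update_assessment(q2, a2, p2, step)
    return q1, q2
```

Calling `simulate()` with the default `shock_scale=0.0` gives the no-shock case, in which each player simply keeps choosing whichever action currently has the highest assessment; a positive `shock_scale` adds noise to the assessments before each choice.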

Keywords: Adaptive learning in games, Q-Learning, Finitely repeated games, Coordination games, Finitely repeated prisoner’s dilemma, Agent quantal response equilibrium, Self-confirming equilibrium, Asynchronous stochastic approximation

JEL Classification: D83, C72, C73, D81

Suggested Citation

Funai, Naoki, Adaptive Learning in Finitely Repeated Games (April 20, 2021). Available at SSRN.

Naoki Funai (Contact Author)

Shiga University

Hikone, Shiga, 522-8522
