A Theoretical Analysis of Cooperative Behavior in Multi-Agent Q-Learning

23 Pages Posted: 7 Feb 2006

Ludo Waltman

Erasmus University Rotterdam - Faculty of Economics and Business

Uzay Kaymak

Erasmus University Rotterdam (EUR) - Faculty of Economics - Department of Computer Science; Erasmus Research Institute of Management (ERIM)

Date Written: February 1, 2006

Abstract

A number of experimental studies have investigated whether cooperative behavior may emerge in multi-agent Q-learning. In some studies cooperative behavior did emerge, in others it did not. This report provides a theoretical analysis of this issue. The analysis focuses on multi-agent Q-learning in iterated prisoner’s dilemmas. It is shown that under certain assumptions cooperative behavior may emerge when multi-agent Q-learning is applied in an iterated prisoner’s dilemma. An important consequence of the analysis is that multi-agent Q-learning may result in non-Nash behavior. It is found experimentally that the theoretical results derived in this report are quite robust to violations of the underlying assumptions.
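As a rough illustration of the setting the abstract describes, the following is a minimal sketch of two independent Q-learners playing an iterated prisoner's dilemma. It is not the authors' model: the payoff values, the stateless (single-state) Q-table, the epsilon-greedy exploration rule, and all parameter values and function names (`payoff`, `choose`, `q_update`, `simulate`) are assumptions chosen for illustration only.

```python
import random

# Illustrative payoff values (an assumption, not taken from the report).
# They satisfy the prisoner's dilemma ordering T > R > P > S and 2R > T + S.
T, R, P, S = 5.0, 3.0, 1.0, 0.0
COOPERATE, DEFECT = 0, 1


def payoff(own, other):
    """Payoff to the player choosing `own` against an opponent choosing `other`."""
    if own == COOPERATE:
        return R if other == COOPERATE else S
    return T if other == COOPERATE else P


def choose(q, epsilon, rng):
    """Epsilon-greedy selection over two Q-values (ties go to COOPERATE)."""
    if rng.random() < epsilon:
        return rng.randrange(2)
    return COOPERATE if q[COOPERATE] >= q[DEFECT] else DEFECT


def q_update(q, action, reward, alpha, gamma):
    """Stateless Q-learning update: Q(a) += alpha * (r + gamma * max Q - Q(a))."""
    q[action] += alpha * (reward + gamma * max(q) - q[action])


def simulate(rounds=10000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run two independent learners; return both Q-tables and the
    fraction of rounds in which both players cooperated."""
    rng = random.Random(seed)
    q1, q2 = [0.0, 0.0], [0.0, 0.0]
    mutual_cooperation = 0
    for _ in range(rounds):
        a1 = choose(q1, epsilon, rng)
        a2 = choose(q2, epsilon, rng)
        # Each agent learns only from its own action and its own payoff.
        q_update(q1, a1, payoff(a1, a2), alpha, gamma)
        q_update(q2, a2, payoff(a2, a1), alpha, gamma)
        if a1 == COOPERATE and a2 == COOPERATE:
            mutual_cooperation += 1
    return q1, q2, mutual_cooperation / rounds
```

Calling `simulate()` returns the two Q-tables and the share of rounds with mutual cooperation. How large that share is depends on the assumed parameters and exploration rule, so this sketch should not be read as reproducing the report's theoretical or experimental results.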

Keywords: Cooperation, Multi-Agent Q-Learning, Multi-Agent Reinforcement Learning, Nash Equilibrium, Prisoner’s Dilemma

Suggested Citation

Waltman, Ludo and Kaymak, Uzay, A Theoretical Analysis of Cooperative Behavior in Multi-Agent Q-Learning (February 1, 2006). ERIM Report Series Reference No. ERS-2006-006-LIS, Available at SSRN: https://ssrn.com/abstract=880523

Ludo Waltman (Contact Author)

Erasmus University Rotterdam - Faculty of Economics and Business

P.O. Box 1738
3000 DR Rotterdam, NL 3062 PA
Netherlands
+31 10 408 1182 (Phone)
+31 10 408 9640 (Fax)

Uzay Kaymak

Erasmus University Rotterdam (EUR) - Faculty of Economics - Department of Computer Science

P.O. Box 1738
3000 DR Rotterdam
Netherlands

Erasmus Research Institute of Management (ERIM)

P.O. Box 1738
3000 DR Rotterdam
Netherlands
