Thompson Sampling: Predicting Behavior in Games and Markets

65 Pages. Posted: 31 Oct 2017. Last revised: 29 Nov 2019.

Date Written: November 14, 2019


This paper proposes Thompson Sampling as a tractable theory of expectation formation across very different situations. Thompson Sampling accounts for learning behavior and for the fact that dispersion in expectations varies over time and across environments. Under Thompson Sampling, agents with limited information about their environment update their subjective belief distributions in a Bayesian way and then make a random draw from the posterior. Conditional on that draw, agents optimize. Thompson Sampling provides a unifying explanation for several empirical puzzles: (1) human behavior often differs from the predictions of Nash equilibrium; (2) market volatility varies both over time and across markets; (3) individual forecasts in survey data are widely dispersed. Across very different datasets, Thompson Sampling predicts human behavior better than commonly used theories of decision-making in economics, such as Nash equilibrium, Bayesian learning with exogenous shocks, and quantal response equilibrium (QRE).
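The three steps named in the abstract (Bayesian updating, a random draw from the posterior, and optimization conditional on that draw) can be illustrated with a minimal sketch. This is not the paper's model of games or markets: it assumes a hypothetical setting with a few actions that each pay off with an unknown fixed probability, a uniform Beta(1, 1) prior, and illustrative function names (`thompson_step`, `simulate`) chosen here.

```python
import random


def thompson_step(successes, failures, rng):
    """Choose one action by Thompson Sampling (illustrative sketch).

    Assumes Bernoulli payoffs and a uniform Beta(1, 1) prior, so the
    posterior for action i is Beta(1 + successes[i], 1 + failures[i]).
    """
    # Random draw from each action's posterior belief distribution.
    draws = [rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)]
    # Optimize conditional on the draw: take the action whose sampled
    # payoff probability is highest.
    return max(range(len(draws)), key=draws.__getitem__)


def simulate(true_probs, periods, seed=0):
    """Run the agent for a number of periods in a hypothetical environment."""
    rng = random.Random(seed)
    n = len(true_probs)
    successes = [0] * n
    failures = [0] * n
    choices = []
    for _ in range(periods):
        action = thompson_step(successes, failures, rng)
        # The environment reveals a payoff; the agent updates its counts,
        # which is the Bayesian update under the Beta-Bernoulli assumption.
        if rng.random() < true_probs[action]:
            successes[action] += 1
        else:
            failures[action] += 1
        choices.append(action)
    return choices, successes, failures


if __name__ == "__main__":
    choices, successes, failures = simulate([0.2, 0.8], periods=500)
    print("share of periods choosing action 1:", choices.count(1) / len(choices))
```

Because the choice depends on a random draw rather than on the posterior mean, behavior is stochastic early on, when beliefs are diffuse, and settles as evidence accumulates. In this toy setting that mirrors the abstract's point that dispersion varies over time.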

Keywords: Learning, bounded rationality, behavioral game theory, expectations, stochastic choice

JEL Classification: C91, C92, D84, E37

Suggested Citation

Mauersberger, Felix, Thompson Sampling: Predicting Behavior in Games and Markets (November 14, 2019). Available at SSRN: or

Felix Mauersberger (Contact Author)

University of Bonn

