Optimal Exploration

35 Pages. Posted: 13 Sep 2018

Date Written: September 13, 2018


Consider a decision maker who has to choose one of several alternatives, and who is imperfectly informed about the payoff of each of them. In each period, the decision maker has to decide whether to stop and take one of the alternatives, or to continue researching the alternatives. New information is costly and is never conclusive. We provide a dynamic programming formulation of the decision maker’s problem with either a finite deadline or no deadline, and give necessary and sufficient conditions for research to take place for some prior beliefs about the alternatives. We show that, at least for short deadlines, the decision maker either explores the best alternative and stops after good news, or explores the second best alternative and stops after bad news, with the former path being optimal if the decision maker is relatively optimistic about the payoff of the alternatives.
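As a rough illustration of the finite-deadline dynamic program described in the abstract, the sketch below solves a toy version by backward recursion. It is not the paper's model. Everything in it is an assumption made for illustration: two alternatives, each either good (payoff 1) or bad (payoff 0); beliefs p1 and p2 that each is good; a per-period research cost `cost`; and a binary signal that is correct with probability `q` strictly between 0.5 and 1, so news is informative but never conclusive. The names `posterior` and `solve` are hypothetical.

```python
from functools import lru_cache


def posterior(p, q, good_news):
    """Bayes update of the belief p that an alternative is good,
    after a binary signal that is correct with probability q."""
    if good_news:
        return p * q / (p * q + (1 - p) * (1 - q))
    return p * (1 - q) / (p * (1 - q) + (1 - p) * q)


def solve(p1, p2, periods, cost=0.02, q=0.7):
    """Return (value, action) for beliefs (p1, p2) with `periods` research
    periods left. Action is 'stop', 'research 1', or 'research 2'.
    All parameters are illustrative assumptions, not taken from the paper."""

    @lru_cache(maxsize=None)
    def value(a, b, t):
        # Stopping means taking the alternative with the higher expected payoff.
        best = (max(a, b), "stop")
        if t == 0:
            return best
        for i, label in ((0, "research 1"), (1, "research 2")):
            p = a if i == 0 else b
            prob_good = p * q + (1 - p) * (1 - q)  # chance of good news
            up = posterior(p, q, True)
            down = posterior(p, q, False)
            if i == 0:
                after_good = value(up, b, t - 1)[0]
                after_bad = value(down, b, t - 1)[0]
            else:
                after_good = value(a, up, t - 1)[0]
                after_bad = value(a, down, t - 1)[0]
            cont = prob_good * after_good + (1 - prob_good) * after_bad - cost
            if cont > best[0]:
                best = (cont, label)
        return best

    return value(p1, p2, periods)


if __name__ == "__main__":
    for horizon in (0, 1, 3):
        print(horizon, solve(0.6, 0.5, horizon))
```

In this toy version, research takes place only when the expected gain from possibly switching alternatives exceeds the research cost, and a longer deadline can never lower the value because stopping is always available.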

Keywords: optimal sequencing of experimentation, multi-armed bandit problem, bandits, Pandora's Box, sequential sampling

JEL Classification: C41, C61

Suggested Citation

Austen-Smith, David and Martinelli, César, Optimal Exploration (September 13, 2018). GMU Working Paper in Economics No. 18-25. Available at SSRN: https://ssrn.com/abstract=3249069 or http://dx.doi.org/10.2139/ssrn.3249069

David Austen-Smith

Kellogg School of Management, Northwestern University

2001 Sheridan Road
Evanston, IL 60208
United States

César Martinelli (Contact Author)

George Mason University - Interdisciplinary Center for Economic Science (ICES)

400P Truland Building
George Mason University
Fairfax, VA 22030
United States
