Woodroofe's One-Armed Bandit Problem Revisited

Annals of Applied Probability, Vol. 19, No. 4, pp. 1603-1633, 2009

31 Pages Posted: 21 Oct 2011


Assaf Zeevi

Columbia University - Columbia Business School, Decision Risk and Operations

Alexander Goldenshluger

National Research University Higher School of Economics (Moscow); University of Haifa

Date Written: 2009

Abstract

We consider the one-armed bandit problem of Woodroofe [J. Amer. Statist. Assoc. 74 (1979) 799-806], which involves sequential sampling from two populations: one whose characteristics are known, and one which depends on an unknown parameter and incorporates a covariate. The goal is to maximize cumulative expected reward. We study this problem in a minimax setting and develop rate-optimal policies that involve suitable modifications of the myopic rule. It is shown that the regret, as well as the rate of sampling from the inferior population, can be finite or grow at various rates with the time horizon of the problem, depending on "local" properties of the covariate distribution. Proofs rely on martingale methods and information-theoretic arguments.
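To make the setup concrete, the following is a minimal illustrative simulation of a covariate-driven one-armed bandit run with a myopic rule plus a brief forced-exploration phase. All numerical values (the known-arm mean, the linear reward model, the noise level, the exploration length) are assumptions for illustration only; the paper's rate-optimal policies are more delicate modifications of the myopic rule than this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (not from the paper): known-arm mean reward,
# true slope of the unknown arm's linear reward model, and time horizon.
mu_known = 0.5
theta_true = 1.0
T = 10_000

theta_hat = 0.0   # least-squares estimate of the unknown slope
sxx = 0.0         # running sum of x^2 over pulls of the unknown arm
sxy = 0.0         # running sum of x * reward over those pulls
regret = 0.0      # cumulative regret versus an oracle knowing theta_true

for t in range(T):
    x = rng.uniform(0.0, 1.0)  # covariate observed before acting
    # Myopic rule: pull the unknown arm when its estimated mean reward
    # exceeds the known arm's; the short forced phase guards against
    # premature lock-in on the known arm.
    pull_unknown = (t < 10) or (theta_hat * x > mu_known)
    if pull_unknown:
        reward = theta_true * x + rng.normal(0.0, 0.1)
        sxx += x * x
        sxy += x * reward
        theta_hat = sxy / sxx
    best = max(mu_known, theta_true * x)
    chosen = theta_true * x if pull_unknown else mu_known
    regret += best - chosen

print(f"estimated slope: {theta_hat:.3f}, cumulative regret: {regret:.2f}")
```

In this toy run the regret accumulates only on rounds where the rule picks the wrong arm for the realized covariate; how fast it grows with T, under the minimax analysis of the paper, depends on the local behavior of the covariate distribution near the decision boundary theta_true * x = mu_known.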

Suggested Citation

Zeevi, Assaf and Goldenshluger, Alexander, Woodroofe's One-Armed Bandit Problem Revisited (2009). Annals of Applied Probability, Vol. 19, No. 4, pp. 1603-1633, 2009, Available at SSRN: https://ssrn.com/abstract=1946498

Assaf Zeevi (Contact Author)

Columbia University - Columbia Business School, Decision Risk and Operations ( email )

New York, NY
United States
212-854-9678 (Phone)
212-316-9180 (Fax)

HOME PAGE: http://www.gsb.columbia.edu/faculty/azeevi/

Alexander Goldenshluger

National Research University Higher School of Economics (Moscow)

Myasnitskaya street, 20
Moscow, Moscow 119017
Russia

University of Haifa

Mount Carmel
Haifa, 31905
Israel

