Batched Bandit Problems
26 Pages. Posted: 31 Oct 2015
Date Written: October 29, 2015
Abstract
Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. Our results show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.
Keywords: Multi-armed bandit problems, regret bounds, batches, multi-phase allocation, grouped clinical trials, sample size determination, switching cost
Suggested Citation
Perchet, Vianney and Rigollet, Philippe and Chassang, Sylvain and Snowberg, Erik, Batched Bandit Problems (October 29, 2015). Princeton University William S. Dietrich II Economic Theory Center Research Paper No. 074_2015, Available at SSRN: https://ssrn.com/abstract=2683578 or http://dx.doi.org/10.2139/ssrn.2683578