Counter Intuitive Learning: An Exploratory Study
42 Pages. Posted: 26 Sep 2016
Date Written: August 08, 2016
The literature on learning in unknown environments emphasises reinforcing actions that produce positive results. In some cases, however, success requires shifting from currently successful actions to others. We examine, experimentally and theoretically in a very simple framework, how individuals initially learn both by exploiting information from the pay-offs of actions already taken and by exploring new actions. We analyse whether and how they learn that pay-offs are inter-temporally dependent. We then ran the same experiments with individuals able to observe the actions taken by others, the pay-offs obtained by others, or both. Such observations improved pay-offs if one of the pair had learned to obtain the maximum pay-off.
Keywords: multi-armed bandit, reinforcement learning, eureka moment, pay-off patterns, observational learning
JEL Classification: D810, D830