Learning to Ask the Right Questions: A Multi-armed Bandits Approach
77 Pages
Posted: 6 Mar 2023
Date Written: March 1, 2023
Abstract
Adaptive testing, where the hardness of the next question depends on the candidate’s responses to prior questions, is used in a variety of settings. In this paper, we study the problem of designing an optimal adaptive test whose goal is to classify a candidate’s ability into one of several categories or grades. The candidate’s ability is treated as an unknown factor which, combined with the hardness of the question, determines the probability of answering correctly. The learning algorithm observes only whether the candidate answers a given question correctly. We consider this problem in a fixed-confidence δ-correct framework: the aim is to arrive at the correct grade of a given candidate as quickly as possible, i.e., with the fewest questions asked, while guaranteeing that the probability of error is less than a small, pre-specified δ. We derive a lower bound on the expected number of questions asked by any sequential questioning strategy; this bound is the solution to a min-max optimization problem. We develop geometrical insights into the structure of this optimization problem and its dual formulation. In addition, we propose an algorithm that essentially matches the lower bound. Our key conclusions are that, asymptotically, any candidate needs to be asked questions at no more than two (candidate ability-specific) hardness levels, and that, under reasonably general conditions on the problem structure, questions need to be asked at most at one hardness level. We also propose a related algorithm based on a Gaussian approximation that performs well numerically and admits suitable δ-correct performance guarantees in an asymptotic regime.
Keywords: adaptive testing, multi-armed bandits, best arm identification, adaptive algorithms, Gaussian approximation
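To make the setting in the abstract concrete, the following is a minimal sketch of a δ-correct sequential test for the simplest case of two grades. It is not the algorithm proposed in the paper. It assumes a logistic (Rasch-style) response model, which the abstract does not specify, and it uses a conservative Hoeffding bound with a union bound over time as the stopping rule. The names `p_correct`, `classify_two_grades`, `cutoff`, and `max_questions` are hypothetical.

```python
import math
import random


def p_correct(ability, hardness):
    """Assumed logistic response model (an illustration, not taken from the paper)."""
    return 1.0 / (1.0 + math.exp(-(ability - hardness)))


def classify_two_grades(ability, cutoff, delta, rng, max_questions=1_000_000):
    """Ask questions at hardness == cutoff until the grade is resolved.

    Under the assumed model, the success probability at that hardness
    exceeds 1/2 exactly when ability > cutoff, so the task reduces to
    testing a coin against 1/2. Returns (grade, questions_asked).
    """
    correct = 0
    for t in range(1, max_questions + 1):
        # Only the binary outcome is observed, never the ability itself.
        correct += rng.random() < p_correct(ability, cutoff)
        mean = correct / t
        # Hoeffding radius; summing the failure probabilities over all t
        # gives delta * pi^2 / 12, which is below delta.
        radius = math.sqrt(math.log(4 * t * t / delta) / (2 * t))
        if mean - radius > 0.5:
            return "above", t
        if mean + radius < 0.5:
            return "below", t
    # Safety cap for the sketch only; the guarantee does not cover this branch.
    return ("above" if correct / max_questions > 0.5 else "below"), max_questions


if __name__ == "__main__":
    grade, asked = classify_two_grades(1.0, 0.0, 0.05, random.Random(0))
    print(grade, asked)
```

This sketch asks every question at a single hardness level, which echoes the abstract's conclusion only loosely: the paper characterizes the optimal levels through its min-max lower bound, whereas the cutoff used here is simply a convenient choice for the two-grade case.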