Variance-Based Active Learning
New York University (NYU)
NYU Working Paper No. 2451/14167
For many supervised learning tasks, the cost of acquiring training data is dominated by the cost of class labeling. In this work, we explore active learning for class probability estimation (CPE). Active learning acquires data incrementally, using the model learned so far to help identify especially useful additional data for labeling. We present a new method for active learning, Bootstrap-LV, which chooses new data based on the variance in probability estimates from bootstrap samples. We then show empirically that the method reduces the number of data items that must be labeled, across a wide variety of data sets. We also compare Bootstrap-LV with Uncertainty Sampling, an existing active-learning method for maximizing classification accuracy, and show not only that Bootstrap-LV dominates for CPE but also that it is quite competitive even for accuracy maximization.
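As a rough illustration of the idea of choosing data by the variance of bootstrap probability estimates, the following is a minimal sketch and not the paper's Bootstrap-LV procedure, whose details are not given in this abstract. The one-dimensional k-nearest-neighbor estimator, the function names (`knn_cpe`, `bootstrap_variance_scores`, `select_for_labeling`), and the simple "take the highest-variance examples" rule are all assumptions made for illustration.

```python
import random
from statistics import pvariance


def knn_cpe(train, x, k=3):
    """Hypothetical CPE model: estimate P(y=1 | x) as the fraction of
    positive labels among the k labeled points nearest to x (1-D inputs)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(label for _, label in nearest) / len(nearest)


def bootstrap_variance_scores(labeled, pool, n_boot=10, k=3, seed=0):
    """For each unlabeled x in pool, return the variance of its class
    probability estimates across models built from bootstrap samples."""
    rng = random.Random(seed)
    estimates = [[] for _ in pool]
    for _ in range(n_boot):
        # Bootstrap sample: draw len(labeled) points with replacement.
        sample = [rng.choice(labeled) for _ in labeled]
        for i, x in enumerate(pool):
            estimates[i].append(knn_cpe(sample, x, k))
    return [pvariance(values) for values in estimates]


def select_for_labeling(labeled, pool, batch_size=1, **kwargs):
    """Assumed selection rule: pick the pool examples whose estimates
    vary most across the bootstrap models."""
    scores = bootstrap_variance_scores(labeled, pool, **kwargs)
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return [pool[i] for i in ranked[:batch_size]]


if __name__ == "__main__":
    # Toy data: class 0 on the left, class 1 on the right.
    labeled = [(i * 0.2, 0) for i in range(20)]
    labeled += [(6.2 + i * 0.2, 1) for i in range(20)]
    pool = [0.0, 5.0, 10.0]
    print(select_for_labeling(labeled, pool, batch_size=1))
```

In an active-learning loop, the selected examples would be labeled, added to `labeled`, and the scoring repeated. Points far from the class boundary tend to receive the same estimate from every bootstrap model and so score near zero, while points near the boundary tend to score higher.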
Number of Pages in PDF File: 7
Working Papers Series
Date posted: October 13, 2008