Active Feature-Value Acquisition
33 Pages
Posted: 16 Nov 2006
Date Written: July 18, 2006
Most induction algorithms for building predictive models take as input training data in the form of feature vectors. Acquiring feature values may be costly, and simply acquiring all values may be wasteful or even prohibitively expensive. Active feature-value acquisition (AFA) selects features incrementally in an attempt to improve the predictive model most cost-effectively. This paper presents a framework for AFA based on estimating information value. Although the framework is straightforward in principle, applying it in practice requires estimations and approximations. We present an acquisition policy, Sampled Expected Utility (SEU), that employs particular estimations to enable effective ranking of potential acquisitions in settings where relatively little information is available about the underlying domain. We then present experimental results showing that, compared with a policy of representative sampling for feature acquisition, Sampled Expected Utility reduces the cost of producing a model of a desired accuracy and performs consistently across domains. We also show a considerable improvement over a recently published policy for instance completion, a special case of AFA. Finally, we demonstrate additional promise of the expected utility framework by applying it to the even more general modeling setting in which feature values as well as class labels may be missing and are costly to acquire. This is done by treating the class label as an additional feature, thus combining the settings of AFA and traditional active learning.
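The ranking idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify how value distributions or utilities are estimated, so `value_dist`, `utility_if`, and `cost` are hypothetical caller-supplied functions, and scoring only a random sample of the candidate queries is an assumption suggested by the policy's name.

```python
import random
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

# A query names one missing feature value: (instance index, feature index).
Query = Tuple[int, int]


def expected_utility(
    query: Query,
    value_dist: Callable[[Query], Dict[Hashable, float]],
    utility_if: Callable[[Query, Hashable], float],
    cost: Callable[[Query], float],
) -> float:
    """Expected model improvement per unit cost of acquiring one value.

    value_dist(query)    -> estimated probability of each possible value
    utility_if(query, v) -> estimated improvement if the value turns out to be v
    cost(query)          -> acquisition cost (assumed positive)
    All three are hypothetical estimators supplied by the caller.
    """
    gain = sum(p * utility_if(query, v) for v, p in value_dist(query).items())
    return gain / cost(query)


def select_acquisitions(
    candidates: Iterable[Query],
    value_dist: Callable[[Query], Dict[Hashable, float]],
    utility_if: Callable[[Query, Hashable], float],
    cost: Callable[[Query], float],
    sample_size: int,
    batch_size: int,
    rng: random.Random,
) -> List[Query]:
    """Score a random sample of candidate queries and return the best few."""
    pool = list(candidates)
    # Assumption: only a sample is scored, to limit the number of estimates.
    sample = rng.sample(pool, min(sample_size, len(pool)))
    ranked = sorted(
        sample,
        key=lambda q: expected_utility(q, value_dist, utility_if, cost),
        reverse=True,
    )
    return ranked[:batch_size]
```

In an acquisition loop, the selected values would be acquired, the model retrained, and the estimators refreshed before the next call. Under this sketch, a class label can be handled like any other query by assigning it a feature index of its own, which mirrors the combined setting the abstract describes.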
Keywords: Information acquisition, predictive modeling