Cost-Aware Calibration of Classifiers
23 Pages · Posted: 21 Nov 2022
Date Written: November 7, 2022
Abstract
Most classification techniques in machine learning can produce probability predictions in addition to class predictions. However, these predicted probabilities are often not well-calibrated, in that they deviate from the actual outcome rates (i.e., the proportion of data instances that actually belong to a certain class). A lack of calibration can jeopardize downstream decision tasks that rely on accurate probability predictions. Although several post-hoc calibration methods have been proposed, they generally do not consider the potentially asymmetric costs associated with over- vs. under-prediction. In this research note, we formally define the problem of cost-aware calibration and propose a metric to quantify the cost of miscalibration for a given classifier. Next, we propose three approaches to achieve cost-aware calibration: two are cost-aware adaptations of existing calibration algorithms, while the third (named MetaCal) is a Bayes-optimal learning algorithm inspired by prior work on cost-aware classification. We carry out systematic empirical evaluations of the proposed approaches and find that MetaCal consistently and significantly outperforms the alternative approaches on multiple public datasets. Finally, we generalize the definition, the metric, and the solution algorithms of cost-aware calibration to account for nonlinear cost structures that may arise in real-world decision tasks.
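The abstract does not spell out the proposed miscalibration-cost metric, so the following is only a minimal illustrative sketch of what penalizing over- vs. under-prediction asymmetrically could look like: a cost-weighted variant of the familiar expected calibration error. The function name cost_weighted_ece and the parameters c_over and c_under are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch, NOT the paper's actual metric: a cost-weighted
# variant of expected calibration error (ECE). Within each probability
# bin, the gap between mean confidence and observed outcome rate is
# penalized asymmetrically, with c_over for over-prediction
# (confidence > outcome rate) and c_under for under-prediction.
import numpy as np

def cost_weighted_ece(probs, labels, c_over=2.0, c_under=1.0, n_bins=10):
    """Bin predicted probabilities, compare each bin's mean confidence
    to its observed positive rate, and weight the gap by an asymmetric
    cost depending on the sign of the deviation."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # include probs == 1.0 in the last bin
        mask = (probs >= lo) & ((probs < hi) if hi < 1.0 else (probs <= hi))
        if not mask.any():
            continue
        conf = probs[mask].mean()   # mean predicted probability in bin
        rate = labels[mask].mean()  # observed positive rate in bin
        gap = conf - rate
        cost = c_over * gap if gap > 0 else c_under * (-gap)
        total += mask.mean() * cost  # weight by bin's share of the data
    return total

# Usage: over-prediction costs twice as much as under-prediction.
rng = np.random.default_rng(0)
p = rng.uniform(size=1000)
y = rng.binomial(1, np.clip(p * 1.2, 0.0, 1.0))  # deliberately miscalibrated
print(cost_weighted_ece(p, y, c_over=2.0, c_under=1.0))
```

With c_over = c_under this reduces to the standard binned ECE, so the asymmetric costs are the only new ingredient in this sketch.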
Keywords: machine learning, classification, probability prediction, calibration, cost-aware learning, prescriptive analytics