Decision Making with Machine Learning and ROC Curves
52 Pages Posted: 30 May 2019
Date Written: May 5, 2019
The Receiver Operating Characteristic (ROC) curve is a representation of the statistical information discovered in binary classification problems and is a key concept in machine learning and data science. This paper studies the statistical properties of ROC curves and their implications for model selection. We analyze how different models of incentive heterogeneity and information asymmetry affect the relation between human decisions and ROC curves. Our theoretical discussion is illustrated using a large data set of pregnancy outcomes and doctors' diagnoses from the Pre-Pregnancy Checkups of reproductive-age couples in Henan Province, provided by the Chinese Ministry of Health.
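For readers unfamiliar with the object under study, the following is a minimal sketch of how an empirical ROC curve can be traced from classifier scores and binary outcomes. It is not the paper's estimation procedure; the function names `roc_points` and `auc` and the toy scores and labels are illustrative assumptions, not data from the Henan sample.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points obtained by
    sweeping a decision threshold from high to low over the scores.
    Labels are assumed to be 1 (positive) or 0 (negative)."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Rank observations from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]  # threshold above every score: nothing flagged
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Observations with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical toy data, for illustration only.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
example_labels = [1, 1, 0, 1, 0, 0]
example_curve = roc_points(example_scores, example_labels)
example_auc = auc(example_curve)
```

Each point on the resulting curve corresponds to one decision threshold, so choosing an operating point on the curve amounts to choosing how to trade false positives against false negatives.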
Keywords: ROC Curve, Binary Classification, Neyman-Pearson Lemma, Incentive Heterogeneity, Information Asymmetry
JEL Classification: C44, C45, D81