Generalizability: Machine Learning and Humans-in-the-Loop
RESEARCH HANDBOOK ON BIG DATA LAW (Roland Vogl, ed., Edward Elgar, 2020 Forthcoming)
21 pages. Posted: 10 Jul 2019; last revised: 17 Jun 2020
Date Written: December 17, 2019
Automated decision tools, which increasingly rely on machine learning (ML), are embedded in decision systems that permeate our lives. Examples range from high-stakes systems for offering credit, university admissions, and employment to lower-stakes systems serving advertising. Here, we consider data-driven tools that attempt to predict the likely behavior of individuals. The debate about ML-based decision-making has spawned an important multi-disciplinary literature, which has focused primarily on fairness, accountability, and transparency. We have been struck, however, by the lack of attention to generalizability in the scholarly and policy discourse about whether and how to incorporate automated decision tools into decision systems.
This chapter explores the relationship between generalizability and the division of labor between humans and machines in decision systems. An automated decision tool is generalizable to the extent that its outputs on new cases are as accurate as the outputs it produced on the data used to create it. The generalizability of an ML model depends on the training process, on data availability, and on the underlying predictability of the outcome it models. Ultimately, whether a tool's generalizability is adequate for a particular decision system depends on how it is deployed, usually in conjunction with human adjudicators. Taking generalizability explicitly into account highlights important aspects of decision system design, as well as important normative trade-offs, that might otherwise be missed.
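The notion of generalizability described above is commonly estimated by comparing a model's accuracy on the data used to fit it against its accuracy on data it never saw. The following is a minimal, hypothetical sketch (synthetic data, a simple threshold rule; none of it drawn from the chapter itself) of that comparison:

```python
import random

random.seed(0)

def fit_threshold(xs, ys):
    """Pick the threshold on a single score that maximizes training accuracy."""
    best_t, best_acc = 0.0, 0.0
    for t in sorted(set(xs)):
        acc = sum((x >= t) == y for x, y in zip(xs, ys)) / len(xs)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def accuracy(t, xs, ys):
    """Fraction of cases the threshold rule classifies correctly."""
    return sum((x >= t) == y for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data: a single score loosely predicts a binary outcome.
xs = [random.gauss(0, 1) for _ in range(200)]
ys = [x + random.gauss(0, 1) > 0 for x in xs]

# Hold out half the data; the rule is fit without ever seeing it.
train_x, train_y = xs[:100], ys[:100]
test_x, test_y = xs[100:], ys[100:]

t = fit_threshold(train_x, train_y)
train_acc = accuracy(t, train_x, train_y)
test_acc = accuracy(t, test_x, test_y)

# A generalizable tool shows held-out accuracy close to training accuracy;
# a large gap signals overfitting to the data used to create it.
print(f"training accuracy: {train_acc:.2f}, held-out accuracy: {test_acc:.2f}")
```

Because the threshold is chosen to maximize accuracy on the training sample, training accuracy tends to overstate performance; the held-out figure is the better estimate of how the tool would behave when deployed.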
Keywords: machine learning, artificial intelligence, prediction, validation, automated decision-making, policy, rules, standards, big data, law, legal