Representativeness in Statistics, Politics, and Machine Learning
ACM Conference on Fairness, Accountability, and Transparency (FAccT) 2021
13 Pages. Posted: 9 Feb 2021. Last revised: 24 Mar 2021
Date Written: January 10, 2021
Abstract
Representativeness is a foundational yet slippery concept. Though familiar at first blush, it lacks a single precise meaning. Instead, meanings range from typical or characteristic, to a proportionate match between sample and population, to a more general sense of accuracy, generalizability, coverage, or inclusiveness. Moreover, the concept has long been contested. In statistics, debates about the merits and methods of selecting a representative sample date back to the late 19th century; in politics, debates about the value of likeness as a logic of political representation are older still. Today, as the concept crops up in the study of fairness and accountability in machine learning, we need to carefully consider the term’s meanings in order to communicate clearly and account for their normative implications. In this paper, we ask what representativeness means, how it is mobilized socially, and what values and ideals it communicates or confronts. We trace the concept’s history in statistics and discuss normative tensions concerning its relationship to likeness, exclusion, authority, and aspiration. We draw on these analyses to think through how representativeness is used in FAccT debates, with emphasis on data, shift, participation, and power.
Keywords: representativeness, sampling, fairness, bias, participation, inclusion
JEL Classification: C18, C83, C81