Proxy Discrimination in the Age of Artificial Intelligence and Big Data
62 Pages Posted: 17 Apr 2019 Last revised: 9 Apr 2020
Date Written: August 5, 2019
Abstract
Big data and Artificial Intelligence (“AI”) are revolutionizing the ways in which firms, governments, and employers classify individuals. Surprisingly, however, one of the most important threats to antidiscrimination regimes posed by this revolution is largely unexplored or misunderstood in the extant literature. This is the risk that modern algorithms will result in “proxy discrimination.” Proxy discrimination is a particularly pernicious subset of disparate impact. Like all forms of disparate impact, it involves a facially-neutral practice that disproportionately harms members of a protected class. But a practice producing a disparate impact only amounts to proxy discrimination when the usefulness to the discriminator of the facially-neutral practice derives, at least in part, from the very fact that it produces a disparate impact. Historically, this occurred when a firm intentionally sought to discriminate against members of a protected class by relying on a proxy for class membership, such as zip code. However, proxy discrimination need not be intentional when membership in a protected class is predictive of a discriminator’s facially-neutral goal, making discrimination “rational.” In these cases, firms may unwittingly proxy discriminate, knowing only that a facially-neutral practice produces desirable outcomes. This Article argues that AI and big data are game changers when it comes to this risk of unintentional, but “rational,” proxy discrimination. AIs armed with big data are inherently structured to engage in proxy discrimination whenever they are deprived of information about membership in a legally-suspect class whose predictive power cannot be measured more directly by non-suspect data available to the AI. Simply denying AIs access to the most intuitive proxies for such predictive but suspect characteristics does little to thwart this process; instead it simply causes AIs to locate less intuitive proxies. For these reasons, as AIs become even smarter and big data becomes even bigger, proxy discrimination will represent an increasingly fundamental challenge to anti-discrimination regimes that seek to limit discrimination based on protected traits that often happen to be directly predictive characteristics. Numerous anti-discrimination regimes do just that, limiting discrimination based on potentially predictive factors like preexisting conditions, genetics, disability, sex, and even race. This Article offers a menu of potential strategies for combatting this risk of proxy discrimination by AI, including prohibiting the use of non-approved types of discrimination, mandating the collection and disclosure of data about impacted individuals’ membership in legally protected classes, and requiring firms to employ statistical models that isolate only the predictive power of non-suspect variables.
Keywords: Proxy Discrimination, Artificial Intelligence, Insurance, Big Data, GINA
Suggested Citation: Suggested Citation