Discredited Data
60 Pages. Posted: 28 Apr 2021. Last revised: 18 Aug 2023
Date Written: February 18, 2021
Abstract
Jurisdictions are increasingly employing pretrial algorithms as a solution to the racial and socioeconomic inequities in the bail system. But in practice, pretrial algorithms have reproduced the very inequities they were intended to correct. Scholars have diagnosed this as the “biased data problem”: pretrial algorithms generate racially and socioeconomically biased predictions because they are constructed and trained with biased data.
This Article contends that biased data is not the sole cause of algorithmic discrimination. Another reason pretrial algorithms produce biased results is that they are exclusively built and trained with data from carceral knowledge sources – the police, pretrial services agencies, and the court system. Redressing this problem will require a paradigmatic shift away from carceral knowledge sources toward non-carceral knowledge sources. This Article explores knowledge produced by communities most impacted by the criminal legal system (“community knowledge sources”) as one category of non-carceral knowledge sources worth utilizing. Though data derived from community knowledge sources have traditionally been discredited and excluded in the construction of pretrial algorithms, tapping into them offers a path toward developing algorithms that have the potential to produce racially and socioeconomically just outcomes.
Keywords: race, algorithmic discrimination, big data, racial justice, criminal law, law and technology, data, mass incarceration, technology, automated decision making, alternative data, proxy, machine learning, disparate impact, inequality, data source, risk assessment tools
JEL Classification: K19, K14, K42