The Limits of Algorithmic Measures of Race in Studies of Outcome Disparities
35 Pages. Posted: 26 Apr 2023
Date Written: April 22, 2023
Abstract
We show that using algorithms to predict race has significant limitations for measuring and understanding the sources of racial disparities in finance, economics, and other contexts. First, we derive theoretically the direction and magnitude of measurement bias in estimates of unconditional disparities that use predicted instead of actual race. If prediction errors were random, existing algorithms such as BIFSG (Voicu, 2018) would underestimate disparities in credit access for Black borrowers by 30–50%. In practice, the errors are not random: the algorithms are systematically biased toward identifying minority borrowers who are likely to experience worse outcomes. Second, we show that in many applications the accuracy of predicted race is illusory, because many empirical methodologies call for location fixed effects and compare white and minority individuals within a given geography. As a result, conditional disparities can be dramatically underestimated, by up to 60% in some of our analyses. While understating conditional disparities, predicted race overstates the importance of location in explaining disparities. Finally, because algorithm accuracy can vary across subsamples, predicted race can under- or overestimate interaction effects meant to measure cross-sectional variation in disparities.
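The random-error case described in the abstract can be illustrated with a short simulation. This is a minimal sketch under stated assumptions, not the paper's derivation: the population share, sensitivity, specificity, and approval rates are hypothetical values chosen for illustration, and the function names are ours. It relies on the standard result that, when misclassification is unrelated to the outcome, the gap in mean outcomes between predicted groups equals the true gap multiplied by (PPV + NPV - 1), where PPV and NPV are the positive and negative predictive values of the classifier.

```python
import random


def attenuation_factor(share, sensitivity, specificity):
    """Ratio of the predicted-race gap to the true gap when errors are random.

    share:        true population share of the minority group (hypothetical)
    sensitivity:  P(predicted minority | truly minority)
    specificity:  P(predicted non-minority | truly non-minority)
    """
    true_pos = share * sensitivity
    false_neg = share * (1 - sensitivity)
    true_neg = (1 - share) * specificity
    false_pos = (1 - share) * (1 - specificity)
    ppv = true_pos / (true_pos + false_pos)  # P(truly minority | predicted minority)
    npv = true_neg / (true_neg + false_neg)  # P(truly non-minority | predicted non-minority)
    return ppv + npv - 1


def simulate_gap(share, sensitivity, specificity,
                 rate_minority, rate_other, n=200_000, seed=0):
    """Gap in mean outcomes between predicted groups, by Monte Carlo.

    Prediction errors depend only on true group membership, never on the
    outcome, which is what "random" errors means in this sketch.
    """
    rng = random.Random(seed)
    sums = {0: 0, 1: 0}
    counts = {0: 0, 1: 0}
    for _ in range(n):
        if rng.random() < share:  # truly minority
            predicted = int(rng.random() < sensitivity)
            outcome = int(rng.random() < rate_minority)
        else:  # truly non-minority
            predicted = int(rng.random() >= specificity)
            outcome = int(rng.random() < rate_other)
        sums[predicted] += outcome
        counts[predicted] += 1
    return sums[1] / counts[1] - sums[0] / counts[0]


# Hypothetical inputs: 10% minority share, 60% sensitivity, 98% specificity,
# approval rates of 60% (minority) and 80% (other).
factor = attenuation_factor(0.10, 0.60, 0.98)
true_gap = 0.60 - 0.80
estimated_gap = simulate_gap(0.10, 0.60, 0.98, 0.60, 0.80)

print(f"attenuation factor:        {factor:.3f}")
print(f"true gap:                  {true_gap:.3f}")
print(f"predicted-race gap (sim.): {estimated_gap:.3f}")
print(f"formula-implied gap:       {factor * true_gap:.3f}")
```

With these illustrative inputs the predicted-race gap is smaller in magnitude than the true gap, which is the direction of bias the abstract describes for random errors. The sketch does not cover the systematic (non-random) errors or the within-location comparisons discussed in the abstract.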
Keywords: machine learning, race, measurement error, racial disparities, Paycheck Protection Program
JEL Classification: G20, G21, G38