Using Sensitive Personal Data May Be Necessary for Avoiding Discrimination in Data-Driven Decision Models

Zliobaite I. & Custers B. (2016), Using sensitive personal data may be necessary for avoiding discrimination in data-driven decision models, Artificial Intelligence and Law (24): 183-201.

20 pages. Posted: 4 Oct 2017. Last revised: 7 Oct 2017.

Indre Zliobaite

Independent

Bart Custers

Leiden University - Center for Law and Digital Technologies

Date Written: June 1, 2016

Abstract

Increasing numbers of decisions about everyday life are made using algorithms. By algorithms we mean predictive models (decision rules) captured from historical data using data mining. Such models often decide prices we pay, select ads we see and news we read online, match job descriptions and candidate CVs, decide who gets a loan, who goes through an extra airport security check, or who gets released on parole. Yet growing evidence suggests that decision making by algorithms may discriminate against people, even if the computing process is fair and well-intentioned. This happens due to biased or non-representative learning data in combination with inadvertent modeling procedures. From the regulatory perspective there are two tendencies in relation to this issue: (1) to ensure that data-driven decision making is not discriminatory, and (2) to restrict overall collecting and storing of private data to a necessary minimum. This paper shows that from the computing perspective these two goals are contradictory. We demonstrate empirically and theoretically with standard regression models that in order to make sure that decision models are non-discriminatory, for instance, with respect to race, the sensitive racial information needs to be used in the model building process. Of course, after the model is ready, race should not be required as an input variable for decision making. From the regulatory perspective this has an important implication: collecting sensitive personal data is necessary in order to guarantee fairness of algorithms, and law making needs to find sensible ways to allow using such data in the modelling process.
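The claim about standard regression models can be illustrated with a small sketch on synthetic data. This is not the authors' code or experiment. The variable names (`s` for a binary sensitive attribute, `x` for a legitimate feature correlated with it, `y` for a historically biased outcome) and all coefficients are assumptions chosen for illustration. A model fitted without `s` lets `x` absorb part of the bias as a proxy, whereas fitting with `s` isolates the bias in its own coefficient, which can then be replaced by a constant so that `s` is not needed at decision time.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical synthetic data (all values are illustrative assumptions).
s = rng.integers(0, 2, size=n).astype(float)   # sensitive attribute (0/1)
x = s + rng.normal(size=n)                     # legitimate feature, correlated with s
y = 2.0 * x - 1.5 * s + rng.normal(size=n)     # historical outcome, biased against s = 1


def ols(features, target):
    """Ordinary least squares with an intercept; returns [intercept, coefs...]."""
    design = np.column_stack([np.ones(len(target)), *features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


# Model fitted WITHOUT the sensitive attribute: x acts as a proxy for s,
# so its coefficient is pulled away from the true effect of 2.0.
naive = ols([x], y)

# Model fitted WITH the sensitive attribute: the bias is captured by the
# coefficient on s, and the coefficient on x stays close to 2.0.
full = ols([x, s], y)


def predict_without_s(x_new):
    """Predict from x only, replacing the s term by its population average."""
    return full[0] + full[2] * s.mean() + full[1] * x_new


print("coef on x, fitted without s:", round(naive[1], 3))
print("coef on x, fitted with s:   ", round(full[1], 3))
print("coef on s, fitted with s:   ", round(full[2], 3))
```

In this sketch the sensitive attribute is used only while fitting. The final predictor takes `x` alone, which mirrors the abstract's point that the attribute is needed for model building but should not be required as an input for decision making.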

Keywords: Non-discrimination, Fairness, Regression, Data mining

Suggested Citation

Zliobaite, Indre and Custers, Bart, Using Sensitive Personal Data May Be Necessary for Avoiding Discrimination in Data-Driven Decision Models (June 1, 2016). Artificial Intelligence and Law (24): 183-201. Available at SSRN: https://ssrn.com/abstract=3047233

Indre Zliobaite

Independent

Bart Custers (Contact Author)

Leiden University - Center for Law and Digital Technologies

2300 RA Leiden, NL-2300RA
Netherlands
