A Legal Framework for AI Training Data - From First Principles to the Artificial Intelligence Act
Law, Innovation and Technology (forthcoming)
39 Pages. Posted: 14 Apr 2020. Last revised: 23 Jul 2021.
Date Written: March 18, 2020
In response to recent regulatory initiatives at the EU level, this article shows that training data for AI not only play a key role in the development of AI applications, but are also inadequately captured by current EU law. I focus on three central risks of AI training data: risks concerning data quality, discrimination and innovation. Existing EU law, with the new copyright exception for text and data mining, adequately addresses only part of this risk profile. The article therefore develops the foundations for a discrimination-sensitive quality regime for data sets and AI training, one that is independent of the controversial question of whether data protection law applies to AI training data. Furthermore, it spells out concrete guidelines for the re-use of personal data for AI training purposes under the GDPR. Ultimately, the legislative and interpretive task lies in striking an appropriate balance between individual protection and the promotion of innovation. The article concludes with an assessment of the proposal for an Artificial Intelligence Act in this respect.
Keywords: artificial intelligence, machine learning, training data, data protection law, anti-discrimination law, contract law, product liability, TDM exception, Artificial Intelligence Act
JEL Classification: K10, K2, K13