The Training of Generative AI Is Not Text and Data Mining
European Intellectual Property Review (E.I.P.R.), forthcoming 2/2025
28 Pages. Posted: 19 Dec 2024. Last revised: 20 Dec 2024.
Date Written: October 19, 2024
Abstract
The creative capacities of generative artificial intelligence (AI) systems can be attributed to the extensive training of the underlying models. This training uses massive amounts of data, most of which is protected by copyright. While the discussion in the US is conducted in light of the fair use defence, AI developers in Europe rely on the exceptions for text and data mining under the DSM Directive 2019/790. However, a closer look at the technological foundations of generative AI training reveals that the text and data mining exception does not apply. Training generative AI models without licences for the works used as training data therefore constitutes copyright infringement.
Keywords: AI, Artificial Intelligence, copyright, Text and Data Mining, TDM exception, generative AI models, DSM Directive, AI Act
