Copyright Law and Generative AI Training - Technological and Legal Foundations
(Urheberrecht und Training generativer KI-Modelle - Technologische und juristische Grundlagen)

The paper is an Open Access book in the NOMOS publishers' series.
https://www.nomos-elibrary.de/10.5771/9783748949558/urheberrecht-und-training-generativer-ki-modelle?page=1

Posted: 9 Oct 2024


Tim W. Dornis

Leibniz University Hannover; New York University School of Law

Sebastian Stober

Otto-von-Guericke University, Magdeburg

Date Written: August 29, 2024

Abstract

Generative AI is transforming creative fields by rapidly producing texts, images, music, and videos. These AI creations often seem as impressive as human-made works, but the models behind them require extensive training on vast amounts of data, much of which is copyright-protected. This dependency on copyrighted material has sparked legal debates, as AI training involves “copying” and “reproducing” these works, actions that could potentially infringe copyright. In defense, AI proponents in the United States invoke “fair use” under Section 107 of the Copyright Act, while in Europe they cite Article 4(1) of the 2019 DSM Directive, which allows certain uses of copyrighted works for “text and data mining.”

This study challenges the prevailing European legal stance, presenting several arguments:

1. The exception for text and data mining should not apply to generative AI training because the technologies differ fundamentally: text and data mining processes semantic information only, while generative AI training also extracts syntactic information.

2. There is no suitable copyright exception or limitation to justify the massive infringements occurring during the training of generative AI. This concerns the copying of protected works during data collection, the full or partial replication inside the AI model, and the reproduction of works from the training data initiated by the end-users of AI systems like ChatGPT.

3. Even if AI training occurs outside Europe, developers cannot fully avoid European copyright laws. If works are replicated inside an AI model, making the model available in Europe could infringe the “right of making available” under Article 3 of the InfoSoc Directive. Accordingly, offering AI services to European users ultimately subjects developers to European copyright laws and to the jurisdiction of European courts.

This study suggests rethinking copyright issues in the context of AI. Given the technical revolution and socio-economic disruptions that generative AI brings, lawmakers should reconsider how to balance the protection of human creativity with the interest in AI innovation. The current lack of regulation neglects the technical realities and is thus not only legally unsound but also unjust.

Note: Downloadable document is in German.

Keywords: Artificial Intelligence, generative AI, intellectual property, copyright, text and data mining, TDM, Künstliche Intelligenz, generative KI-Modelle, Urheberrecht, TDM-Schranke, ChatGPT, Stable Diffusion, DSM Directive

Suggested Citation

Dornis, Tim W. and Stober, Sebastian, Copyright Law and Generative AI Training - Technological and Legal Foundations (Urheberrecht und Training generativer KI-Modelle - Technologische und juristische Grundlagen) (August 29, 2024). Available at SSRN: https://ssrn.com/abstract=4946214

Tim W. Dornis (Contact Author)

Leibniz University Hannover

Königsworther Platz 1
Hannover, 30167
Germany

New York University School of Law

40 Washington Square South
New York, NY 10012-1099
United States

Sebastian Stober

Otto-von-Guericke University, Magdeburg
