Data Rivers: Carving Out the Public Domain in the Age of Generative AI

23 Pages. Posted: 27 Apr 2023. Last revised: 15 May 2023

Sylvie Delacroix

University of Birmingham - Birmingham Law School; The Alan Turing Institute

Date Written: March 14, 2023

Abstract

The salient question, today, is not whether ‘copyright law [will] allow robots to learn’. The pressing question is whether the fragile data ecosystem that makes generative AI possible can be re-balanced through intervention that is timely enough. The threats to this ecosystem come from multiple fronts. They are comparable in kind to the threats currently affecting ‘water rivers’ across the globe.

First, just as the fundamental human right to water is only possible if ‘reasonable use’ and reciprocity constraints are imposed on the economic exploitation of rivers, so is the fundamental right to access culture, learn and build upon it. It is that right, and the moral aspirations underlying it, that has led millions to share their creative works under ‘open’ licenses. Generative AI tools would not have been possible without access to that rich, high-quality content. Yet few of those tools respect the reciprocity expectations without which the Creative Commons and Open-Source movements cease to be sustainable. The absence of internationally coordinated standards to systematically identify AI-generated content also threatens our ‘data rivers’ with irreversible pollution.

Second, the process that has allowed large corporations to seize control of data and its definition as an asset subject to property rights has effectively enabled the construction of hard structures (canals or dams) that has led to the rights of many of those lying up- or downstream of such structures being ignored. While data protection laws seek to address those power imbalances by granting ‘personal’ data rights, the exercise of those rights remains demanding, just as it is challenging for artists to defend their IP rights in the face of AI-generated works that threaten them with redundancy.

To tackle the above threats, the long overdue reform of copyright can only be part of the required intervention. Equally important is the construction of bottom-up empowerment infrastructure that gives long-term agency to those wishing to share their data and/or creative works. This infrastructure would also play a central role in reviving much-needed democratic engagement. Data not only carries traces of our past. It is also a powerful tool to envisage different futures. There is no doubt that tools such as GPT-4 will change us. We would be fools to believe we may leverage those tools at the service of a variety of futures by merely imposing sets of ‘post-hoc’ regulatory constraints.

Keywords: Large language models, GPT 4, generative AI, public domain, copyright, empowerment, data trusts, privilege

Suggested Citation

Delacroix, Sylvie, Data Rivers: Carving Out the Public Domain in the Age of Generative AI (March 14, 2023). Available at SSRN: https://ssrn.com/abstract=4388928 or http://dx.doi.org/10.2139/ssrn.4388928

Sylvie Delacroix (Contact Author)

University of Birmingham - Birmingham Law School

Edgbaston
Birmingham, B15 2TT
United Kingdom

HOME PAGE: https://www.birmingham.ac.uk/staff/profiles/law/delacroix-sylvie.aspx

The Alan Turing Institute

96 Euston Road
London, NW1 2DB
United Kingdom
