Strategic Behavior and AI Training Data

32 Pages

Posted: 29 Apr 2024

Christian Peukert

University of Lausanne - Faculty of Business and Economics (HEC Lausanne)

Florian Abeillon

University of Lausanne - Faculty of Business and Economics (HEC Lausanne)

Jérémie Haese

University of Lausanne - Faculty of Business and Economics (HEC Lausanne)

Franziska Kaiser

University of Lausanne - Faculty of Business and Economics (HEC Lausanne)

Alexander Staub

University of Lausanne - Faculty of Business and Economics (HEC Lausanne)

There are 2 versions of this paper

Date Written: April 28, 2024

Abstract

Human-created works represent critical data inputs to artificial intelligence (AI). Strategic behavior can play a major role in shaping AI training datasets, be it in limiting access to existing works or in deciding which types of new works to create or whether to create new works at all. We examine creators' behavioral change when their works become training data for AI. Specifically, we focus on contributors on Unsplash, a popular stock image platform with about 6 million high-quality photos and illustrations. In the summer of 2020, Unsplash launched an AI research program by releasing a dataset of 25,000 images for commercial use. We study contributors' reactions, comparing contributors whose works were included in this dataset to contributors whose works were not included. Our results suggest that treated contributors left the platform at a higher-than-usual rate and substantially slowed down the rate of new uploads. Professional and more successful photographers react more strongly than amateurs and less successful photographers. We also show that affected users changed the variety and novelty of contributions to the platform, with long-run implications for the stock of works potentially available for AI training. Taken together, our findings highlight the trade-off between the interests of rightsholders and the promotion of innovation at the technological frontier. We discuss implications for copyright and AI policy.

Keywords: Generative Artificial Intelligence, Training Data, Licensing, Copyright, Natural Experiment

Suggested Citation

Peukert, Christian and Abeillon, Florian and Haese, Jérémie and Kaiser, Franziska and Staub, Alexander, Strategic Behavior and AI Training Data (April 28, 2024). Available at SSRN: https://ssrn.com/abstract=4807979 or http://dx.doi.org/10.2139/ssrn.4807979

Christian Peukert (Contact Author)

University of Lausanne - Faculty of Business and Economics (HEC Lausanne)

Switzerland

HOME PAGE: https://www.christian-peukert.com/

Florian Abeillon

University of Lausanne - Faculty of Business and Economics (HEC Lausanne)

Switzerland

Jérémie Haese

University of Lausanne - Faculty of Business and Economics (HEC Lausanne)

Switzerland

Franziska Kaiser

University of Lausanne - Faculty of Business and Economics (HEC Lausanne)

Quartier UNIL-Chamberonne
Lausanne, 1015
Switzerland

Alexander Staub

University of Lausanne - Faculty of Business and Economics (HEC Lausanne)

Switzerland

Paper statistics

Downloads: 241
Abstract Views: 1,525
Rank: 227,119