Pay With Your Data: Optimal Data-Sharing Mechanisms for AI Services

41 Pages. Posted: 28 Aug 2023. Last revised: 30 Aug 2023.

Sameer Mehta

Rotterdam School of Management, Erasmus University

Chandrasekhar Manchiraju

Eli Broad College of Business, Michigan State University

Milind Dawande

University of Texas at Dallas - Department of Information Systems & Operations Management

Ganesh Janakiraman

University of Texas at Dallas - Naveen Jindal School of Management

Date Written: August 26, 2023

Abstract

Rapid advances in Machine Learning (ML) have led to a proliferation of Artificial Intelligence (AI) services offered by firms. To develop a valuable AI service, a firm must build an accurate ML model, which in turn requires a large amount of training data. Firms today usually obtain this data by offering consumers incentives to share their data during the initial development phase of the AI service, and then use that data to re-train the ML models and improve the quality of the service. Consumers, on the other hand, incur privacy costs for sharing their data. Inspired by AI services such as speech-to-text conversion offered by Google, and ChatGPT and DALL-E offered by OpenAI, we analyze two popular data-sharing mechanisms that firms employ in practice: manual data-sharing and algorithmic data-sharing. In the former approach, consumers decide the amount of data to share with the firm, whereas in the latter, the firm uses algorithmic data-redaction (an established approach used by technology firms such as Amazon, IBM, and Oracle) to identify and censor sensitive segments of data and to determine the amount of data collected from consumers. For both data-sharing approaches, we obtain revenue-maximizing mechanisms for the firm and analyze the fundamental differences between the two approaches in terms of the revenue accrued by the firm, the consumer surplus, and the volume of data collected. Our analysis uncovers several interesting economic effects. For instance, we show that the firm can obtain a higher revenue with an inferior data-redaction algorithm, and we highlight two nuanced effects underlying this behavior: a weak privacy-cost-compensation effect and a strong data-collection effect.

Keywords: data-sharing mechanisms, artificial intelligence services, mechanism design

JEL Classification: D47, D40

Suggested Citation

Mehta, Sameer and Manchiraju, Chandrasekhar and Dawande, Milind and Janakiraman, Ganesh, Pay With Your Data: Optimal Data-Sharing Mechanisms for AI Services (August 26, 2023). Available at SSRN: https://ssrn.com/abstract=4552550 or http://dx.doi.org/10.2139/ssrn.4552550

Sameer Mehta (Contact Author)

Rotterdam School of Management, Erasmus University

RSM Erasmus University
PO Box 1738
Rotterdam, 3062 PA
Netherlands

Chandrasekhar Manchiraju

Eli Broad College of Business, Michigan State University

632 Bogue St
East Lansing, MI 48824
United States

Milind Dawande

University of Texas at Dallas - Department of Information Systems & Operations Management

P.O. Box 830688
Richardson, TX 75083-0688
United States

Ganesh Janakiraman

University of Texas at Dallas - Naveen Jindal School of Management

P.O. Box 830688
Richardson, TX 75083-0688
United States
