Whose Data, Whose Value? Simple Exercises in Data and Modeling Evaluation and Implications for Tech Law and Policy

40 Pages. Posted: 15 May 2023. Last revised: 7 Jun 2023.

Aileen Nielsen

ETH Zurich, Center for Law and Economics

Date Written: May 1, 2023

Abstract

Scholarship on the phenomena of big data and algorithmically driven digital environments has largely studied these technological and economic phenomena as monolithic practices, with little interest shown in the varying degrees of quality of contribution by data subjects and data processors. Taking a pragmatic, industry-inspired approach to measuring quality of contribution, this Essay finds evidence for a wide range of relative value contributions by data subjects. In some cases, a very small proportion of data from a few data subjects is sufficient to achieve the same performance on a given task as is achieved with a much more voluminous dataset. Likewise, the algorithmic models generated by different data processors for the same task and with the same data resources show a wide range in quality of contribution, even in highly performance-incentivized conditions. In short, data subjects, and indeed individual data points within the same dataset, are not equal, fungible commodities, contrary to the trope of data as the new oil. Likewise, the role of talent and skill in algorithmic development is significant, as with other forms of innovation. Both of these observations have received little, if any, attention in discussions of data governance.

These observations of substantial variation among data subjects and data processors demonstrate that heterogeneous value contributions are likely common and so could be important in crafting appropriate law for the Big Data economy. Heterogeneity in value contribution is likely undertheorized in tech law scholarship, particularly because this heterogeneity has implications for privacy law, competition policy, and innovation. The Essay concludes by highlighting some of these implications and by posing an empirical research agenda to fill in the missing information needed to realize the potential of more nuanced policymaking that is sensitive to the wide range of talent and skill exhibited by data subjects and data processors alike.

Keywords: data valuation, algorithms, privacy, data protection, law and computer science

Suggested Citation

Nielsen, Aileen, Whose Data, Whose Value? Simple Exercises in Data and Modeling Evaluation and Implications for Tech Law and Policy (May 1, 2023). Available at SSRN: https://ssrn.com/abstract=4434409 or http://dx.doi.org/10.2139/ssrn.4434409

Aileen Nielsen (Contact Author)

ETH Zurich, Center for Law and Economics

LEE G104
Leonhardstrasse 21
Zurich
Switzerland
