Economies of Scope in Data Aggregation: Evidence from Health Data

32 pages. Posted: 9 Feb 2023


Seyit Höcük

Centerdata

Bertin Martens

Tilburg Law and Economics Center (TILEC); Bruegel

Patricia Prufer

CentERdata; Tilburg University

Bruno Carballa Smichowski

Joint Research Centre (European Commission); Centre d’Economie de l’Université Paris Nord (CEPN) – UMR 7234 CNRS-Université Sorbonne, Paris 13.

Néstor Duch-Brown

Joint Research Centre - European Commission

Pradeep Kumar

CentERdata

Joris Mulder

CentERdata; Tilburg University

Date Written: November 10, 2022

Abstract

Economies of scope in data aggregation (ESDA) are attracting the attention of policymakers and researchers because of the efficiency gains they could bring about. Antitrust authorities, in turn, are concerned about their potential anti-competitive outcomes. However, the concept remains blurry and lacks empirical backing. We provide a definition: the improvement in the predictive power of a dataset resulting from adding complementary variables to it. It differs from traditional economies of scope, which are based on re-use of data or other resources. After deriving a theoretical model of ESDA, we estimate it by progressively adding explanatory variables to a dataset of health and health-related data that we use to predict health outcomes. Our three main findings confirm the existence of ESDA and lead to novel policy implications. First, in our dataset, a 1% increase in the number of predictor variables improves prediction accuracy by between 0.087% and 0.132%. Second, we find a positive non-linear relation between variable complementarity and ESDA. Third, in our models, ESDA are subject to increasing returns up to the third quartile of variables, and to diminishing returns thereafter. Our results support policies fostering the concentration of data in large pools with shared use rights.
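
The estimation idea described in the abstract (add predictors step by step, track prediction accuracy, and read off an elasticity) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' model or data: the data are synthetic, accuracy is measured as out-of-sample R² from ordinary least squares, and the function names `accuracy_by_variable_count` and `scope_elasticity` are hypothetical. The elasticity is taken as the slope of a log-log fit of accuracy on the number of variables, so a slope of 0.1 would mean that a 1% increase in variables goes with roughly a 0.1% gain in accuracy.

```python
import numpy as np


def accuracy_by_variable_count(X, y, train_frac=0.7):
    """Out-of-sample R^2 of an OLS model using the first k columns of X, for k = 1..p."""
    n, p = X.shape
    n_train = int(n * train_frac)
    X_tr, X_te = X[:n_train], X[n_train:]
    y_tr, y_te = y[:n_train], y[n_train:]
    total_ss = np.sum((y_te - y_te.mean()) ** 2)
    scores = []
    for k in range(1, p + 1):
        # Design matrices with an intercept column plus the first k predictors.
        A_tr = np.column_stack([np.ones(n_train), X_tr[:, :k]])
        A_te = np.column_stack([np.ones(n - n_train), X_te[:, :k]])
        coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
        resid = y_te - A_te @ coef
        scores.append(1.0 - (resid @ resid) / total_ss)
    return np.array(scores)


def scope_elasticity(scores):
    """Slope of log(accuracy) on log(number of variables), using positive scores only."""
    k = np.arange(1, len(scores) + 1)
    mask = scores > 0  # the logarithm is undefined for non-positive R^2
    slope, _ = np.polyfit(np.log(k[mask]), np.log(scores[mask]), 1)
    return slope


# Synthetic illustration (assumed setup): 20 independent predictors with equal weight.
rng = np.random.default_rng(0)
n, p = 600, 20
X = rng.normal(size=(n, p))
y = X @ np.ones(p) + rng.normal(size=n)

scores = accuracy_by_variable_count(X, y)
elasticity = scope_elasticity(scores)
print(f"R^2 with 1 variable:   {scores[0]:.3f}")
print(f"R^2 with {p} variables: {scores[-1]:.3f}")
print(f"log-log elasticity:    {elasticity:.3f}")
```

In this toy setup every predictor contributes equally and independently, so accuracy grows roughly in proportion to the number of variables. Real data with correlated or weakly informative variables would typically show a much smaller elasticity, and the order in which variables are added would matter.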

Note:
Funding Information: This study was financed by the Joint Research Centre of the European Commission under contract number 940149/2020NL.

Declaration of Interests: All co-authors of this paper declare that they do not have any conflict of interest in this study.

Keywords: Economies of scope, Health, Data aggregation, Predictive modelling, Machine learning

JEL Classification: D24, L86, I10

Suggested Citation

Höcük, Seyit and Martens, Bertin and Prufer, Patricia and Carballa Smichowski, Bruno and Duch-Brown, Néstor and Kumar, Pradeep and Mulder, Joris, Economies of Scope in Data Aggregation: Evidence from Health Data (November 10, 2022). TILEC Discussion Paper No. 020, 2022, Available at SSRN: https://ssrn.com/abstract=4338447 or http://dx.doi.org/10.2139/ssrn.4338447

Seyit Höcük

Centerdata

Warandelaan 2
Tilburg
Netherlands

Bertin Martens

Tilburg Law and Economics Center (TILEC)

Warandelaan 2
Tilburg, 5000 LE
Netherlands

Bruegel

Rue de la Charité 33
B-1210 Brussels
Belgium

Patricia Prufer

CentERdata

PO Box 90153
Tilburg, NL 5000 LE
Netherlands

Tilburg University

Department of Economics
CentER
Tilburg, 5032 RE
Netherlands

HOME PAGE: http://center.uvt.nl/phd_stud/prufer/

Bruno Carballa Smichowski (Contact Author)

Joint Research Centre (European Commission)

Centre d’Economie de l’Université Paris Nord (CEPN) – UMR 7234 CNRS-Université Sorbonne, Paris 13.

France

Néstor Duch-Brown

Joint Research Centre - European Commission

Edificio Expo, C
Inca Garcilaso, 3
Sevilla, E-41092
Spain

Pradeep Kumar

CentERdata

PO Box 90153
Tilburg, NL 5000 LE
Netherlands

Joris Mulder

CentERdata

PO Box 90153
Tilburg, NL 5000 LE
Netherlands

HOME PAGE: https://www.centerdata.nl/en/team/joris-mulder

Tilburg University

P.O. Box 90153
Tilburg, Noord-Brabant 5000 LE
Netherlands
