Economies of Scope in Data Aggregation: Evidence from Health Data
32 Pages Posted: 9 Feb 2023
Date Written: November 10, 2022
Economies of scope in data aggregation (ESDA) are attracting the attention of policymakers and researchers because of the efficiency gains they could bring about. Antitrust authorities, in turn, are concerned about their potential anti-competitive outcomes. However, the concept remains blurry and lacks empirical backing. We define ESDA as the improvement in the predictive power of a dataset that results from adding complementary variables to it. This differs from traditional economies of scope, which are based on the re-use of data or other resources. After deriving a theoretical model of ESDA, we estimate it by progressively adding explanatory variables to a dataset of health and health-related data that we use to predict health outcomes. Our three main findings confirm the existence of ESDA and lead to novel policy implications. First, in our dataset, a 1% increase in the number of predictor variables improves prediction accuracy by between 0.087% and 0.132%. Second, we find a positive non-linear relation between variable complementarity and ESDA. Third, in our models, ESDA are subject to increasing returns up to the third quartile of variables, and to diminishing returns thereafter. Our results support policies fostering the concentration of data in large pools with shared use rights.
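To make the elasticity reading of the first finding concrete, the following is a minimal sketch of the general idea of adding predictors one at a time and relating accuracy to the number of variables. It is not the paper's model or data: the synthetic dataset, the use of ordinary least squares, out-of-sample R^2 as the accuracy measure, and the function names `r2_by_variable_count` and `log_log_elasticity` are all assumptions made for illustration.

```python
import numpy as np

def r2_by_variable_count(X_train, y_train, X_test, y_test):
    """Out-of-sample R^2 of an OLS fit on the first k columns, for k = 1..p."""
    scores = []
    # Baseline error: predicting the training mean for every test observation.
    ss_tot = np.sum((y_test - y_train.mean()) ** 2)
    for k in range(1, X_train.shape[1] + 1):
        # Intercept plus the first k predictors (hypothetical ordering).
        A_train = np.column_stack([np.ones(len(X_train)), X_train[:, :k]])
        A_test = np.column_stack([np.ones(len(X_test)), X_test[:, :k]])
        beta, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
        ss_res = np.sum((y_test - A_test @ beta) ** 2)
        scores.append(1.0 - ss_res / ss_tot)
    return np.array(scores)

def log_log_elasticity(counts, scores):
    """Slope of log(accuracy) on log(number of variables).

    A slope of e means a 1% increase in the variable count is associated
    with roughly an e% change in accuracy.
    """
    slope, _intercept = np.polyfit(np.log(counts), np.log(scores), 1)
    return float(slope)

# Synthetic data (an assumption): 20 equally informative, independent predictors.
rng = np.random.default_rng(0)
n, p = 6000, 20
X = rng.normal(size=(n, p))
y = X @ np.full(p, 0.5) + rng.normal(size=n)

X_train, X_test = X[:4000], X[4000:]
y_train, y_test = y[:4000], y[4000:]

scores = r2_by_variable_count(X_train, y_train, X_test, y_test)
counts = np.arange(1, p + 1)
elasticity = log_log_elasticity(counts, scores)
print(f"log-log elasticity of R^2 with respect to variable count: {elasticity:.3f}")
```

Because the synthetic predictors here are independent and equally informative, the elasticity this toy example produces says nothing about the magnitudes reported above; it only illustrates how a percentage-on-percentage relationship between variable count and accuracy can be read off a log-log fit.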
Funding Information: This study was financed by the Joint Research Centre of the European Commission under contract number 940149/2020NL.
Declaration of Interests: All co-authors of this paper declare that they do not have any conflict of interest in this study.
Keywords: Economies of scope, Health, Data aggregation, Predictive modelling, Machine learning
JEL Classification: D24, L86, I10