Centralization, Fragmentation, and Replication in the Genomic Data Commons

Centralization, Fragmentation, and Replication in the Genomic Data Commons, in Governing Medical Knowledge Commons (Brett M. Frischmann, Michael J. Madison, and Katherine J. Strandburg eds., Cambridge University Press 2017)

UC Davis Legal Studies Research Paper No. 448

30 Pages. Posted: 26 Aug 2015. Last revised: 4 Jan 2018.

Peter Lee

University of California, Davis - School of Law

Date Written: August 24, 2015

Abstract

Researchers around the world deposit enormous amounts of genomic sequence data and related information into public databases, thus creating a genomic data commons. This chapter examines specific governance challenges of correcting, updating, and annotating these data. Delving into the science of genome sequencing, assembly, and annotation, it highlights the indeterminate nature of sequence data and related information and the high rate of errors in public databases such as GenBank. Drawing on the Institutional Analysis and Development framework, it then examines four approaches for dynamically correcting and modifying these data: author-centric data management, third-party biocuration, community-based wikification, and specialized databases and genome browsers. Notably, these approaches reveal deep tensions between centralization and fragmentation in the structure of the genomic data commons. On the one hand, author-centric data management and third-party biocuration represent highly centralized mechanisms for controlling data. On the other hand, wiki-based annotation disperses control throughout the community, exploiting the power of the commons and parallel data analysis to update existing data records. Attempting to capture the best of both worlds, specialized databases and genome browsers exploit replication and the nonrivalrous nature of information to preserve original data records while allowing users to codify vast amounts of value-added knowledge. This study shows that, far from being a passive repository of information, the genomic data commons is a teeming, dynamic entity in which communal intervention is critical to enhancing collective knowledge. Ultimately, the genomic data commons is an intensely human commons in more ways than one.

Keywords: commons, genomics, GenBank, data, databases, biocuration, wikification, gene browsers, NIH

Suggested Citation

Lee, Peter, Centralization, Fragmentation, and Replication in the Genomic Data Commons (August 24, 2015). Centralization, Fragmentation, and Replication in the Genomic Data Commons, in Governing Medical Knowledge Commons (Brett M. Frischmann, Michael J. Madison, and Katherine J. Strandburg eds., Cambridge University Press 2017); UC Davis Legal Studies Research Paper No. 448. Available at SSRN: https://ssrn.com/abstract=2650189

Peter Lee (Contact Author)

University of California, Davis - School of Law

Martin Luther King, Jr. Hall
Davis, CA 95616-5201
United States
