Protecting Genomic Data Analytics in the Cloud: State of the Art and Opportunities

BMC Medical Genomics (2016) 9:63

San Diego Legal Studies Paper No. 16-240

11 Pages. Posted: 19 Oct 2016. Last revised: 21 Dec 2016.

Haixu Tang

Indiana University Bloomington - School of Informatics and Computing

Xiaoqian Jiang

University of California, San Diego (UCSD)

XiaoFeng Wang

University of California, San Diego (UCSD) - Department of Biomedical Informatics

Shuang Wang

University of California, San Diego (UCSD)

Heidi Sofia

National Institutes of Health (NIH) - National Human Genome Research Institute

Dov Fox

University of San Diego: School of Law

Kristin Lauter

Microsoft Corporation - Microsoft Research - Redmond

Bradley Malin

Vanderbilt University - School of Medicine

Amalio Telenti

J. Craig Venter Institute

Li Xiong

Emory University - Department of Mathematics and Computer Science

Lucila Ohno-Machado

University of California, San Diego (UCSD)

Date Written: October 18, 2016

Abstract

The outsourcing of genomic data into public cloud computing settings raises concerns over privacy and security. Significant advancements in secure computation methods have emerged over the past several years, but such techniques need to be rigorously evaluated for their ability to support the analysis of human genomic data in an efficient and cost-effective manner. With respect to public cloud environments, there are concerns about the inadvertent exposure of human genomic data to unauthorized users. In analyses involving multiple institutions, there is additional concern about data being used beyond the agreed research scope and being processed in untrusted computational environments, which may not satisfy institutional policies. To systematically investigate these issues, the NIH-funded National Center for Biomedical Computing iDASH (integrating Data for Analysis, ‘anonymization’ and SHaring) hosted the second Critical Assessment of Data Privacy and Protection competition to assess the capacity of cryptographic technologies for protecting computation over human genomes in the cloud and promoting cross-institutional collaboration. Data scientists were challenged to design and engineer practical algorithms for secure outsourcing of genome computation tasks in working software, whereby analyses are performed only on encrypted data. They were also challenged to develop approaches that enable secure collaboration on data from genomic studies generated by multiple organizations (e.g., medical centers), so that aggregate statistics can be computed jointly without sharing individual-level records. The results of the competition indicated that secure computation techniques can enable comparative analysis of human genomes, but greater efficiency (in terms of compute time and memory utilization) is needed before they are sufficiently practical for real-world environments.

Keywords: Genomic Data, Privacy, Security, NIH, National Center for Biomedical Computing

Suggested Citation

Tang, Haixu and Jiang, Xiaoqian and Wang, XiaoFeng and Wang, Shuang and Sofia, Heidi and Fox, Dov and Lauter, Kristin and Malin, Bradley and Telenti, Amalio and Xiong, Li and Ohno-Machado, Lucila, Protecting Genomic Data Analytics in the Cloud: State of the Art and Opportunities (October 18, 2016). BMC Medical Genomics (2016) 9:63; San Diego Legal Studies Paper No. 16-240. Available at SSRN: https://ssrn.com/abstract=2854352

Haixu Tang

Indiana University Bloomington - School of Informatics and Computing

Informatics West, Room 204
901 E. 10th Street
Bloomington, IN 47408
United States

Xiaoqian Jiang

University of California, San Diego (UCSD)

9500 Gilman Drive
Mail Code 0502
La Jolla, CA 92093-0112
United States

XiaoFeng Wang

University of California, San Diego (UCSD) - Department of Biomedical Informatics

9500 Gilman Drive
La Jolla, CA 92093
United States

Shuang Wang

University of California, San Diego (UCSD)

9500 Gilman Drive
Mail Code 0502
La Jolla, CA 92093-0112
United States

Heidi Sofia

National Institutes of Health (NIH) - National Human Genome Research Institute

Building 29, Room $a56
49 Convent Dr, MSC 4472
Bethesda, MD 20892
United States

Dov Fox (Contact Author)

University of San Diego: School of Law

5998 Alcalá Park
San Diego, CA 92110
United States
(619) 260-4600 (Phone)

HOME PAGE: http://www.sandiego.edu/law/news/news_releases/newslist.php?_focus=44957

Kristin Lauter

Microsoft Corporation - Microsoft Research - Redmond

Building 99
Redmond, WA
United States

Bradley Malin

Vanderbilt University - School of Medicine

Nashville, TN 37232-0685
United States

Amalio Telenti

J. Craig Venter Institute

4120 Capricorn Lane
La Jolla, CA 92037
United States

Li Xiong

Emory University - Department of Mathematics and Computer Science

Atlanta, GA
United States

Lucila Ohno-Machado

University of California, San Diego (UCSD)

9500 Gilman Drive
Mail Code 0502
La Jolla, CA 92093-0112
United States
