Recovering Gene Interactions from Single-Cell Data Using Data Diffusion

104 Pages
Posted: 13 Apr 2018
Sneak Peek Status: Published

David Van Dijk

Memorial Sloan Kettering Cancer Center - Program for Computational and Systems Biology

Roshan Sharma

Memorial Sloan Kettering Cancer Center - Program for Computational and Systems Biology; Columbia University - Department of Applied Physics and Applied Mathematics

Juozas Nainys

Vilnius University - Sector of Microtechnologies; Columbia University - Department of Biological Sciences

Kristina Yim

Yale University - Department of Genetics

Pooja Kathail

Memorial Sloan Kettering Cancer Center - Program for Computational and Systems Biology; Columbia University - Department of Biological Sciences

Ambrose Carr

Memorial Sloan Kettering Cancer Center - Program for Computational and Systems Biology; Columbia University - Department of Biological Sciences

Cassandra Burdziak

Memorial Sloan Kettering Cancer Center - Program for Computational and Systems Biology

Kevin R. Moon

Yale University - Department of Genetics

Christine L. Chaffer

Garvan Institute of Medical Research

Diwakar Pattabiraman

Massachusetts Institute of Technology (MIT) - Whitehead Institute for Biomedical Research

Brian Bierie

Massachusetts Institute of Technology (MIT) - Whitehead Institute for Biomedical Research

Linas Mazutis

Memorial Sloan Kettering Cancer Center - Program for Computational and Systems Biology

Guy Wolf

Yale University - Applied Mathematics Program

Smita Krishnaswamy

Yale University - Department of Genetics

Dana Pe’er

Memorial Sloan Kettering Cancer Center - Program for Computational and Systems Biology

Abstract

Single-cell RNA-sequencing is revolutionizing biological discovery. However, scRNA-seq technologies suffer from many sources of significant technical noise, the most prominent being undersampling of mRNA molecules, often termed ‘dropout’. Dropout can severely obscure important gene-gene relationships and impedes the possibility of learning gene regulatory networks at single cell resolution. To address this, we developed MAGIC (Markov Affinity-based Graph Imputation of Cells), a computational approach that shares information across similar cells, via data diffusion, to correct the mRNA count matrix and fill in missing transcripts. We validate MAGIC on a number of biological systems and find it effective at recovering gene-gene relationships and additional structures. We use MAGIC to explore the epithelial-to-mesenchymal transition (EMT) and reveal a phenotypic continuum of states, with the majority of cells residing in intermediate states that display stem-like signatures. Further, MAGIC uncovers the dynamics of gene expression underlying EMT, including known and novel regulatory interactions, demonstrating that our approach is able to successfully predict regulatory relations without perturbations.
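
The idea of sharing information across similar cells via data diffusion can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a fixed-bandwidth Gaussian kernel on raw expression distances, and the function name `diffuse_impute` and the parameters `sigma` and `t` are hypothetical names chosen for illustration. It builds a cell-cell affinity matrix, row-normalizes it into a Markov transition matrix, and applies that matrix `t` times to the count matrix. Plain Python is used so the example is self-contained; real data would call for an array library.

```python
import math


def diffuse_impute(counts, sigma=1.0, t=3):
    """Smooth a cells x genes matrix by diffusing values across similar cells.

    counts: list of rows, one row per cell and one column per gene.
    sigma:  Gaussian kernel bandwidth (hypothetical parameter, an assumption).
    t:      number of diffusion steps (hypothetical parameter, an assumption).
    """
    n = len(counts)

    def sqdist(a, b):
        # Squared Euclidean distance between two expression profiles.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Gaussian affinity: similar cells receive weights close to 1.
    affinity = [
        [math.exp(-sqdist(counts[i], counts[j]) / (2 * sigma ** 2)) for j in range(n)]
        for i in range(n)
    ]

    # Row-normalize so each row sums to 1 (a Markov transition matrix).
    markov = [[a / sum(row) for a in row] for row in affinity]

    # Apply the Markov matrix t times. Each step replaces a cell's profile
    # with the affinity-weighted average of all cells' current profiles.
    imputed = [[float(v) for v in row] for row in counts]
    for _ in range(t):
        imputed = [
            [sum(markov[i][k] * imputed[k][g] for k in range(n))
             for g in range(len(imputed[0]))]
            for i in range(n)
        ]
    return imputed


# Toy data (made up): the first cell has a zero for the second gene,
# while two similar cells express it.
toy_counts = [[5.0, 0.0], [5.0, 4.0], [4.0, 5.0]]
smoothed = diffuse_impute(toy_counts, sigma=3.0, t=2)
```

In this toy example the zero entry of the first cell becomes positive after diffusion, because it is replaced by a weighted average that includes neighboring cells expressing that gene. Larger `t` smooths more aggressively, so the number of steps trades noise reduction against loss of cell-to-cell variation.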

Suggested Citation

Van Dijk, David and Sharma, Roshan and Nainys, Juozas and Yim, Kristina and Kathail, Pooja and Carr, Ambrose and Burdziak, Cassandra and Moon, Kevin R. and Chaffer, Christine L. and Pattabiraman, Diwakar and Bierie, Brian and Mazutis, Linas and Wolf, Guy and Krishnaswamy, Smita and Pe’er, Dana, Recovering Gene Interactions from Single-Cell Data Using Data Diffusion (2018). Available at SSRN: https://ssrn.com/abstract=3155779 or http://dx.doi.org/10.2139/ssrn.3155779
This is a paper under consideration at Cell Press and has not been peer-reviewed.

David Van Dijk (Contact Author)

Memorial Sloan Kettering Cancer Center - Program for Computational and Systems Biology

New York, NY
United States

Roshan Sharma

Memorial Sloan Kettering Cancer Center - Program for Computational and Systems Biology

New York, NY
United States

Columbia University - Department of Applied Physics and Applied Mathematics

New York, NY 10027
United States

Juozas Nainys

Vilnius University - Sector of Microtechnologies

Vilnius
Lithuania

Columbia University - Department of Biological Sciences

New York, NY 10027
United States

Kristina Yim

Yale University - Department of Genetics

333 Cedar Street
New Haven, CT 06520
United States

Pooja Kathail

Memorial Sloan Kettering Cancer Center - Program for Computational and Systems Biology

New York, NY
United States

Columbia University - Department of Biological Sciences

New York, NY 10027
United States

Ambrose Carr

Memorial Sloan Kettering Cancer Center - Program for Computational and Systems Biology

New York, NY
United States

Columbia University - Department of Biological Sciences

New York, NY 10027
United States

Cassandra Burdziak

Memorial Sloan Kettering Cancer Center - Program for Computational and Systems Biology

New York, NY
United States

Kevin R. Moon

Yale University - Department of Genetics

333 Cedar Street
New Haven, CT 06520
United States

Christine L. Chaffer

Garvan Institute of Medical Research

384 Victoria Street
Darlinghurst, New South Wales 2010
Australia

Diwakar Pattabiraman

Massachusetts Institute of Technology (MIT) - Whitehead Institute for Biomedical Research

Cambridge, MA
United States

Brian Bierie

Massachusetts Institute of Technology (MIT) - Whitehead Institute for Biomedical Research

Cambridge, MA
United States

Linas Mazutis

Memorial Sloan Kettering Cancer Center - Program for Computational and Systems Biology

New York, NY
United States

Guy Wolf

Yale University - Applied Mathematics Program

51 Prospect Street
New Haven, CT 06511
United States

Smita Krishnaswamy

Yale University - Department of Genetics

333 Cedar Street
New Haven, CT 06520
United States

Dana Pe’er

Memorial Sloan Kettering Cancer Center - Program for Computational and Systems Biology

New York, NY
United States
