
Multi-Ancestry Transcriptome Prediction with Functionally Informed Variants in TOPMed MESA Improves Performance of Transcriptome-Wide Association Studies

34 Pages
Posted: 28 Mar 2025
Publication Status: Under Review


Xiaowei Hu

University of Virginia

Daniel S. Araujo

University of Chicago

Chachrit Khunsriraksakul

Pennsylvania State University

Lida Wang

Pennsylvania State University

Quan Sun

University of North Carolina (UNC) at Chapel Hill

Jia Wen

University of North Carolina (UNC) at Chapel Hill

Lingbo Zhou

University of North Carolina (UNC) at Chapel Hill

Lynette Ekunwe

University of Mississippi - University of Mississippi Medical Center

Leslie A. Lange

University of Colorado, Aurora - Anschutz Medical Campus

Ethan M. Lange

University of Colorado, Aurora - Anschutz Medical Campus

Stephen B. Montgomery

Stanford University - Department of Pathology

Alexander P. Reiner

University of Washington

Francois Aguet

Broad Institute of MIT and Harvard

Kristin G. Ardlie

Broad Institute of MIT and Harvard

Tuuli Lappalainen

New York Genome Center; Columbia University; Royal Institute of Technology (KTH)

Christopher R. Gignoux

University of Colorado, Aurora - Anschutz Medical Campus; University of Colorado Denver

Esteban Burchard

University of California, San Francisco (UCSF)

Kent D. Taylor

University of California, Los Angeles (UCLA)

Xiuqing Guo

University of California, Los Angeles (UCLA)

Jerome I. Rotter

University of California, Los Angeles (UCLA)

Stephen S. Rich

University of Virginia - Center for Public Health Genomics

Elaine Cornell

University of Vermont - Department of Pathology and Laboratory Medicine

Peter Durda

University of Vermont

Russell P. Tracy

University of Vermont - Department of Pathology & Laboratory Medicine

Yongmei Liu

Duke University

W. Craig Johnson

University of Washington - Department of Biostatistics

George P. Papanicolaou

Government of the United States of America - National Heart, Lung and Blood Institute

Minoli A. Perera

Northwestern University

Michael H. Cho

Harvard University

Dajiang J. Liu

Pennsylvania State University

Laura M. Raffield

University of North Carolina (UNC) at Chapel Hill

Yun Li

University of North Carolina (UNC) at Chapel Hill - Department of Biostatistics

TOPMed Multi-Omics Working Group

Independent

Heather E. Wheeler

Loyola University of Chicago

Hae Kyung Im

University of Chicago

Ani Manichaikul

University of Virginia - Center for Public Health Genomics


Abstract

Reliable reference transcriptome prediction models are key to accurate transcriptome-wide association studies (TWAS). With the emergence of multi-ancestry genome-wide association studies (GWAS), there is a need for reliable multi-ancestry transcriptome prediction models for downstream TWAS efforts. Here, we propose three methods that leverage functionally informed variants (FIVs), which are more likely to influence gene expression, to improve multi-ancestry TWAS; we refer to these as FIV-based methods. We trained transcriptome prediction models on 1,287 multi-ancestry participants from the Trans-Omics for Precision Medicine (TOPMed) program Multi-Ethnic Study of Atherosclerosis (MESA) with RNA-seq data from peripheral blood mononuclear cells (PBMCs). We validated the models’ prediction accuracy on two independent external data sets, Geuvadis and the Jackson Heart Study (JHS). To test the robustness of our FIV-based methods for multi-ancestry TWAS, we integrated the resulting transcriptome prediction models with three large-scale multi-ancestry GWASs of blood cell, lipid, and pulmonary function traits. Our FIV-based methods achieved prediction accuracy similar to that of the benchmark method, elastic net (EN), while using a smaller and more accurate set of variants. Additionally, our FIV-based methods achieved significantly higher TWAS power for three GWAS traits (P<0.05, Mann-Whitney U test) and higher TWAS accuracy by F1 score for all GWAS traits except two blood cell traits (average accuracy improvement of 24% over EN). However, no single proposed method performed best across all GWAS traits. To further improve TWAS performance, we propose an omnibus approach that aggregates TWAS summary statistics from our FIV-based methods. The omnibus approach yielded the highest number of Bonferroni-significant TWAS genes for all GWAS traits, and it further improved TWAS power and accuracy for blood cell traits. Additionally, the omnibus approach detected some important trait-relevant genes that EN missed; we provide three examples in the manuscript to demonstrate this improvement. Our study demonstrates the value of including FIVs in multi-ancestry transcriptome prediction models for improving TWAS performance. Further, the improvement in TWAS performance depends on the relevance of the GWAS trait to the tissue or cell type used to build the transcriptome prediction models.
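For readers unfamiliar with the evaluation terms in the abstract, the following minimal Python sketch illustrates how per-method TWAS p-values could be aggregated into an omnibus p-value, filtered at a Bonferroni threshold, and scored with F1 against a reference gene set. The abstract does not state which aggregation rule the authors use; the equal-weight Cauchy combination test shown here is an assumption chosen for illustration, and the gene names, p-values, and reference set are made-up toy values, not results from the paper.

```python
import math


def cauchy_combine(pvalues):
    """Combine p-values with an equal-weight Cauchy combination test.

    Assumption: this is a stand-in for the paper's omnibus step, which the
    abstract does not specify. Inputs must lie strictly between 0 and 1.
    """
    t = sum(math.tan((0.5 - p) * math.pi) for p in pvalues) / len(pvalues)
    return 0.5 - math.atan(t) / math.pi


def bonferroni_significant(gene_pvalues, alpha=0.05):
    """Return genes whose p-value is below alpha divided by the number of genes tested."""
    threshold = alpha / len(gene_pvalues)
    return {gene for gene, p in gene_pvalues.items() if p < threshold}


def f1_score(called, truth):
    """F1 = harmonic mean of precision and recall for a set of called genes."""
    tp = len(called & truth)
    if tp == 0:
        return 0.0
    precision = tp / len(called)
    recall = tp / len(truth)
    return 2 * precision * recall / (precision + recall)


# Toy p-values for four hypothetical genes from three hypothetical methods.
per_method = {
    "GENE_A": [1e-8, 3e-6, 0.2],
    "GENE_B": [0.4, 0.6, 0.5],
    "GENE_C": [0.03, 0.01, 0.02],
    "GENE_D": [1e-4, 0.5, 0.9],
}
omnibus = {gene: cauchy_combine(ps) for gene, ps in per_method.items()}
hits = bonferroni_significant(omnibus)

# Hypothetical reference set of trait-relevant genes used only to score the calls.
truth = {"GENE_A", "GENE_C"}
score = f1_score(hits, truth)
print(sorted(hits), round(score, 3))
```

In this sketch a gene with one very small p-value can remain significant after aggregation even when the other methods give weak evidence, which is one reason an omnibus test can report more significant genes than any single method.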

Keywords: Transcriptome prediction models, transcriptome-wide association study, functional annotation, multi-ancestry

Suggested Citation

Hu, Xiaowei and Araujo, Daniel S. and Khunsriraksakul, Chachrit and Wang, Lida and Sun, Quan and Wen, Jia and Zhou, Lingbo and Ekunwe, Lynette and Lange, Leslie A. and Lange, Ethan M. and Montgomery, Stephen B. and Reiner, Alexander P. and Aguet, Francois and Ardlie, Kristin G. and Lappalainen, Tuuli and Gignoux, Christopher R. and Burchard, Esteban and Taylor, Kent D. and Guo, Xiuqing and Rotter, Jerome I. and Rich, Stephen S. and Cornell, Elaine and Durda, Peter and Tracy, Russell P. and Liu, Yongmei and Johnson, W. Craig and Papanicolaou, George P. and Perera, Minoli A. and Cho, Michael H. and Liu, Dajiang J. and Raffield, Laura M. and Li, Yun and TOPMed Multi-Omics Working Group and Wheeler, Heather E. and Im, Hae Kyung and Manichaikul, Ani, Multi-Ancestry Transcriptome Prediction with Functionally Informed Variants in TOPMed MESA Improves Performance of Transcriptome-Wide Association Studies. Available at SSRN: https://ssrn.com/abstract=5194962 or http://dx.doi.org/10.2139/ssrn.5194962
This version of the paper has not been formally peer reviewed.

Xiaowei Hu (Contact Author)

University of Virginia

