Multi-Ancestry Transcriptome Prediction with Functionally Informed Variants in TOPMed MESA Improves Performance of Transcriptome-Wide Association Studies
34 Pages, Posted: 28 Mar 2025, Publication Status: Under Review
Abstract
Reliable reference transcriptome prediction models are key to accurate transcriptome-wide association studies (TWAS). With the emergence of multi-ancestry genome-wide association studies (GWAS), there is a need for reliable multi-ancestry transcriptome prediction models for downstream TWAS efforts. Here, we propose three methods, hereinafter referred to as FIV-based methods, that improve multi-ancestry TWAS by leveraging functionally informed variants (FIVs), which are more likely to influence gene expression. We trained transcriptome prediction models on 1,287 multi-ancestry participants from the Trans-Omics for Precision Medicine (TOPMed) program Multi-Ethnic Study of Atherosclerosis (MESA) with RNA-seq data from peripheral blood mononuclear cells (PBMCs). We validated the models’ prediction accuracy on two independent external data sets, Geuvadis and the Jackson Heart Study (JHS). To test the robustness of our FIV-based methods for multi-ancestry TWAS, we integrated the trained transcriptome prediction models with three large-scale multi-ancestry GWASs of blood cell, lipid, and pulmonary function traits. Our FIV-based methods achieved prediction accuracy similar to that of the benchmark method, Elastic Net (EN), while using a smaller and more accurate set of variants. Additionally, our FIV-based methods achieved significantly higher TWAS power for three GWAS traits (P<0.05, Mann-Whitney U test) and produced higher TWAS accuracy, as measured by F1 score, for all GWAS traits except two blood cell traits (with an average accuracy improvement of 24% over EN). However, no single proposed method performed best for all GWAS traits. To further improve TWAS performance, we propose an omnibus approach that aggregates TWAS summary statistics from our FIV-based methods. The omnibus approach yielded the highest number of Bonferroni-significant TWAS genes for all GWAS traits, and it further improved TWAS power and accuracy for blood cell traits. Additionally, the omnibus approach detected some important trait-relevant genes that EN missed. We provide three examples in the manuscript to demonstrate the improvement from our omnibus approach. Our study demonstrates the value of including FIVs in multi-ancestry transcriptome prediction models for improving TWAS performance. Further, the improvement in TWAS performance depends on the GWAS trait’s relevance to the tissue or cell type used to build the transcriptome prediction models.
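To make the idea of an omnibus test concrete, the following is a minimal sketch of how per-gene p-values from several prediction methods could be aggregated and then screened at a Bonferroni threshold. The abstract does not state which aggregation rule the authors use; the equal-weight Cauchy combination shown here is an assumption chosen only for illustration, and the gene names, p-values, and function names are hypothetical.

```python
import math


def cauchy_combine(pvalues):
    """Combine p-values with an equal-weight Cauchy combination.

    Assumption: this rule is used for illustration only; the source does not
    say how the omnibus approach aggregates summary statistics.
    Inputs are expected to lie strictly between 0 and 1.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    # Each p-value is mapped to a standard Cauchy variate, then averaged.
    stat = sum(math.tan((0.5 - p) * math.pi) for p in pvalues) / len(pvalues)
    # Convert the averaged statistic back to a p-value via the Cauchy tail.
    return 0.5 - math.atan(stat) / math.pi


def bonferroni_significant(gene_pvalues, alpha=0.05):
    """Return genes whose p-value is below alpha divided by the number of genes."""
    threshold = alpha / len(gene_pvalues)
    return sorted(gene for gene, p in gene_pvalues.items() if p < threshold)


# Hypothetical per-gene TWAS p-values, one per FIV-based method.
per_method = {
    "GENE_A": [1e-8, 3e-7, 0.02],
    "GENE_B": [0.40, 0.55, 0.61],
    "GENE_C": [0.5, 0.5, 0.5],
}

omnibus = {gene: cauchy_combine(ps) for gene, ps in per_method.items()}
significant = bonferroni_significant(omnibus)
```

In this sketch a gene with a strong signal in any one method keeps a small combined p-value, which is one way an omnibus test can recover genes that no single method detects consistently.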
Keywords: Transcriptome prediction models, transcriptome-wide association study, functional annotation, multi-ancestry