Government of the United States of America - Center for Alzheimer's and Related Dementias (CARD); National Institutes of Health (NIH) - Thoracic and GI Malignancies Branch
Government of the United States of America - Center for Alzheimer's and Related Dementias (CARD); National Institutes of Health (NIH) - Cell Biology and Gene Expression Section
Government of the United States of America - Center for Alzheimer's and Related Dementias (CARD); National Institutes of Health - National Institute of Neurological Disorders and Stroke
Government of the United States of America - Center for Alzheimer's and Related Dementias (CARD); National Institute on Aging (NIA) - Molecular Genetics Section
Government of the United States of America - Center for Alzheimer's and Related Dementias (CARD); National Institutes of Health - Laboratory of Neurogenetics
Longitudinal multi-dimensional biological datasets are now abundant. They are essential for understanding disease progression, identifying subtypes, and discovering drugs. Finding meaningful patterns or disease pathophysiologies in these datasets is challenging because their high dimensionality makes hidden patterns difficult to visualize. Several dimensionality reduction methods have been developed, but they are limited to cross-sectional datasets. The recently proposed Aligned-UMAP, an extension of the UMAP algorithm, can visualize high-dimensional longitudinal datasets. In this work, we applied Aligned-UMAP to a broad spectrum of clinical, imaging, proteomics, and single-cell datasets. Aligned-UMAP reveals time-dependent hidden patterns when the embeddings are color-coded with the metadata. We found that the algorithm parameters also play a crucial role and must be tuned carefully to realize the algorithm's full potential.
Altogether, based on its ease of use and our evaluation of its performance on different modalities, we anticipate that Aligned-UMAP will be a valuable tool for the biomedical community. We also believe our benchmarking study will become more important as more high-dimensional longitudinal data become available in biomedical research.
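As a rough illustration of what applying Aligned-UMAP to longitudinal data involves, the sketch below prepares the links between consecutive time points that tell the algorithm which rows belong to the same subject. It is a minimal sketch, not the authors' pipeline: the helper name `build_relations` and the subject IDs are hypothetical, and the closing comment assumes the `AlignedUMAP` class of the umap-learn package, whose `alignment_regularisation` and `alignment_window_size` parameters are the kind of settings that may need tuning.

```python
from typing import Dict, Hashable, List, Sequence


def build_relations(
    ids_per_visit: Sequence[Sequence[Hashable]],
) -> List[Dict[int, int]]:
    """For each pair of consecutive visits, map the row index of a subject
    in the earlier visit to the row index of the same subject in the later
    visit. Subjects missing from the later visit are left out."""
    relations = []
    for earlier, later in zip(ids_per_visit, ids_per_visit[1:]):
        # Row position of every subject in the later visit.
        later_pos = {subject: row for row, subject in enumerate(later)}
        relations.append(
            {
                row: later_pos[subject]
                for row, subject in enumerate(earlier)
                if subject in later_pos
            }
        )
    return relations


# Hypothetical subject IDs in row order for three visits; subjects can
# drop out of or join the study between visits.
visits = [
    ["s1", "s2", "s3"],
    ["s2", "s3", "s4"],
    ["s3", "s4"],
]
relations = build_relations(visits)

# Assumption: with umap-learn installed and one feature matrix per visit
# (rows in the same order as the IDs above), the fit would look roughly like
#   umap.AlignedUMAP(alignment_regularisation=..., alignment_window_size=...)
#       .fit(feature_matrices, relations=relations)
```

One dictionary is produced per pair of consecutive visits, so a study with three visits yields two dictionaries. Keeping the row order of each feature matrix identical to the ID list used here is what makes the index mapping valid.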
Dadu, Anant and Satone, Vipul and Kaur, Rachneet and Koretsky, Mathew and Iwaki, Hirotaka and Qi, Yue and Ramos, Daniel M. and Avants, Brian and Hesterman, Jacob and Gunn, Roger and Cookson, Mark R. and Ward, Michael E. and Singleton, Andrew B. and Campbell, Roy H. and Nalls, Michael A. and Faghri, Faraz, Application of Aligned-UMAP to Longitudinal Biomedical Studies. Available at SSRN: https://ssrn.com/abstract=4292603 or http://dx.doi.org/10.2139/ssrn.4292603
This version of the paper has not been formally peer reviewed.