Preprints with The Lancet is part of SSRN's First Look, a place where journals identify content of interest prior to publication. Authors have opted in at submission to The Lancet family of journals to post their preprints on Preprints with The Lancet. The usual SSRN checks and a Lancet-specific check for appropriateness and transparency have been applied. Preprints available here are not Lancet publications or necessarily under review with a Lancet journal. These preprints are early-stage research papers that have not been peer-reviewed. The findings should not be used for clinical or public health decision making and should not be presented to a lay audience without highlighting that they are preliminary and have not been peer-reviewed. For more information on this collaboration, see the comments published in The Lancet about the trial period, and our decision to make this a permanent offering, or visit The Lancet's FAQ page, and for any feedback please contact preprints@lancet.com.

Automated Machine Learning Optimizes and Accelerates COVID-19 Predictive Modeling

33 Pages Posted: 23 Apr 2021

Georgios Papoutsoglou

Gnosis Data Analysis PC - JADBio

Makrina Karaglani

Gnosis Data Analysis PC - JADBio

Vincenzo Lagani

Ilia State University

Naomi Thomson

Gnosis Data Analysis PC - JADBio

Oluf Dimitri Røe

Norwegian University of Science and Technology (NTNU) - Department of Clinical and Molecular Medicine

Ioannis Tsamardinos

Gnosis Data Analysis PC - JADBio

Ekaterini Chatzaki

Democritus University of Thrace

Abstract

The rapid outbreak of COVID-19 has put intense pressure on healthcare systems, creating an urgent demand for effective diagnostic, prognostic and therapeutic procedures. Despite the global scientific effort, there is a lack of efficient predictive models for patient stratification and successful management of the disease.

Here, we employed Automated Machine Learning (AutoML) to analyze three publicly available COVID-19 datasets, including serum proteomic, metabolomic and transcriptomic measurements. Pathway analysis of the selected features was also performed.

Analysis of a combined proteomic and metabolomic dataset produced ten equivalent signatures of two features each, with AUC 0.840 (CI 0.723–0.941) in discriminating severe from non-severe COVID-19 patients. A transcriptomic dataset led to two equivalent signatures of eight features each, with AUC 0.914 (CI 0.865–0.955) in distinguishing COVID-19 patients from those with a different acute respiratory illness. A second transcriptomic dataset led to two equivalent signatures of nine features each, with AUC 0.967 (CI 0.899–0.996) in distinguishing COVID-19 patients from virus-free individuals. Multiple new features emerged, implicated in a wide range of pathways including viral mRNA translation, interferon gamma signaling and the innate immune system.
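
For readers unfamiliar with how a figure such as "AUC 0.840 (CI 0.723–0.941)" is typically obtained, the sketch below computes an AUC and a percentile-bootstrap confidence interval in plain Python. It is only an illustration under stated assumptions: the abstract does not say how the intervals were estimated, so the percentile bootstrap is a stand-in rather than the authors' procedure, the data are synthetic, and the function names `auc` and `bootstrap_ci` are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample with a single class has no defined AUC
        stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic scores only: 30 "cases" drawn slightly higher than 30 "controls".
data_rng = random.Random(42)
labels = [1] * 30 + [0] * 30
scores = [data_rng.gauss(1.0, 1.0) for _ in range(30)] + [
    data_rng.gauss(0.0, 1.0) for _ in range(30)
]

point = auc(scores, labels)
low, high = bootstrap_ci(scores, labels)
print(f"AUC {point:.3f} (95% CI {low:.3f}-{high:.3f})")
```

With small cohorts the interval is wide, which is consistent with the broadest interval above belonging to the two-feature severity signature.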

In conclusion, applying AutoML built multiple biosignatures quickly and automatically, each with a small number of features and high predictive performance that remained high upon validation. These favorable characteristics make the signatures well suited to further development into cost-effective clinical assays that could contribute to better disease management. Our results also highlight the importance of revisiting valuable, well-built datasets to extract as many conclusions as possible from a given experimental observation.

Funding Statement: No funding was received for this research.

Declaration of Interests: GP, MK, and NT are employees of Gnosis Data Analysis that offers the JADBio service commercially. IT and VL are co-founders of Gnosis Data Analysis that offers the JADBio service commercially and members of its scientific advisory board.

Keywords: COVID-19, automated Machine Learning, SARS-CoV-2, modeling, predictive models, validation

Suggested Citation

Papoutsoglou, Georgios and Karaglani, Makrina and Lagani, Vincenzo and Thomson, Naomi and Røe, Oluf Dimitri and Tsamardinos, Ioannis and Chatzaki, Ekaterini, Automated Machine Learning Optimizes and Accelerates COVID-19 Predictive Modeling. Available at SSRN: https://ssrn.com/abstract=3787455 or http://dx.doi.org/10.2139/ssrn.3787455

Georgios Papoutsoglou

Gnosis Data Analysis PC - JADBio

Crete
Greece

Makrina Karaglani (Contact Author)

Gnosis Data Analysis PC - JADBio

Crete
Greece

Vincenzo Lagani

Ilia State University

Kakutsa Cholokashvili Ave 3/5
Tbilisi, 0162
Georgia

Naomi Thomson

Gnosis Data Analysis PC - JADBio

Crete
Greece

Oluf Dimitri Røe

Norwegian University of Science and Technology (NTNU) - Department of Clinical and Molecular Medicine

Høgskoleringen
Trondheim, NO-7491
Norway

Ioannis Tsamardinos

Gnosis Data Analysis PC - JADBio

Crete
Greece

Ekaterini Chatzaki

Democritus University of Thrace
