Pre-Hospital Prediction of Adverse Outcomes in Patients with Suspected COVID-19: Development, Application and Comparison of Machine Learning and Deep Learning Methods
23 Pages Posted: 25 Mar 2022
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infected millions of people worldwide, causing COVID-19 and increased mortality globally. Patients with suspected COVID-19 attended emergency departments (EDs) and utilised emergency medical services (EMS), leading to increased pressures and waiting times. Rapid and more accurate decision-making is required to identify patients at high risk of clinical deterioration and mortality, so that scarce resources and services can be allocated appropriately. The aim of our study was to develop machine learning (ML) and deep artificial intelligence (AI) models to predict adverse outcomes in patients with suspected COVID-19 and to compare these with a traditional statistical method (logistic regression). We used linked ambulance service data for patients with suspected COVID-19 infection attended by EMS crews in the Yorkshire and Humber region of England from 18th March 2020 to 29th June 2020. The primary outcome was death or need for organ support within 30 days. We trained support vector machine (SVM), extreme gradient boosting (XGB) and artificial neural network (ANN) models and, for comparative purposes, a logistic regression (LR) model to predict the primary outcome. We used stacking ensemble learning methods to improve the prediction of adverse outcomes. Performance of the individual and ensemble models was compared with two baselines: the decision made by EMS clinicians whether to convey patients to hospital, and the PRIEST clinical severity score (Goodacre et al. (2021)). Data were obtained from 7,549 adult patients who were attended by EMS clinicians; of these, 1,330 patients (17.6%) experienced the primary outcome (death or organ support). The three ML methods showed slight improvements in sensitivity over baseline results.
Further improvements were obtained using stacking ensemble methods; the best geometric mean (GM) results were obtained using SVM and ANN as base learners when maximising sensitivity and specificity. The different ML/AI methods varied in the relative importance they gave to features in the predictive models. Further work is required to test the models externally, to develop an automated system that could be used in clinical settings and to study the clinical effectiveness of such a system.
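As an illustration of the geometric mean (GM) criterion mentioned above, the following minimal Python sketch computes sensitivity, specificity and their geometric mean from confusion-matrix counts. It assumes the common definition GM = sqrt(sensitivity × specificity); the abstract does not spell out the formula used in the study. The function names and the counts are hypothetical and are not results from the paper.

```python
from math import sqrt


def sensitivity(tp: int, fn: int) -> float:
    """Share of patients with the adverse outcome that the model flags."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn: int, fp: int) -> float:
    """Share of patients without the adverse outcome that the model clears."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def geometric_mean(tp: int, fn: int, tn: int, fp: int) -> float:
    """Assumed definition: square root of sensitivity times specificity."""
    return sqrt(sensitivity(tp, fn) * specificity(tn, fp))


# Hypothetical counts for illustration only (not taken from the study).
# Sensitivity is 0.9 and specificity is 0.4, so the GM is pulled well below 0.9.
print(round(geometric_mean(tp=90, fn=10, tn=360, fp=540), 3))

# A model that flags every patient has perfect sensitivity but zero
# specificity, so its GM collapses to zero.
print(geometric_mean(tp=100, fn=0, tn=0, fp=900))
```

Because the GM is a product, it rewards models that balance the two rates, which is relevant when only about one patient in six experiences the outcome and overall accuracy would favour predicting "no adverse outcome" for everyone.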
Funding Information: UK National Institute for Health Research, Health Technology Assessment (HTA) programme (project reference 11/46/07)
Conflict of Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical Approval: The North West - Haydock Research Ethics Committee gave a favourable opinion on the PAINTED study on 25th June 2012 (reference 12/NW/0303) and on the updated PRIEST study on 23rd March 2020, including the analysis presented here. The Confidentiality Advisory Group of the NHS Health Research Authority granted approval to collect data without patient consent in line with Section 251 of the National Health Service Act 2006. Access to data collected by NHS Digital was recommended for approval by its Independent Group Advising on the Release of Data (IGARD) on 11th September 2021 having received additional recommendation for access to GP records from the Profession Advisory Group (PAG) on 19th August 2021.
Keywords: COVID-19, outcomes, support vector machines, extreme gradient boosting, artificial neural networks, stacking ensemble