Preprints with The Lancet is a collaboration between The Lancet Group of journals and SSRN to facilitate the open sharing of preprints for early engagement, community comment, and collaboration. Preprints available here are not Lancet publications or necessarily under review with a Lancet journal. These preprints are early-stage research papers that have not been peer-reviewed. The usual SSRN checks and a Lancet-specific check for appropriateness and transparency have been applied. The findings should not be used for clinical or public health decision-making or presented without highlighting these facts. For more information, please see the FAQs.
Non-NAT Definite Diagnosis Models of COVID-19 Based on Hematological Features
15 Pages Posted: 14 Dec 2020
Abstract
Background: Because coronavirus disease 2019 (COVID-19) spreads rapidly, fast and accurate detection of COVID-19 patients is critical to containing the SARS-CoV-2 virus. At present, COVID-19 patients are identified mainly through viral nucleic acid testing (NAT). However, factors such as the timing of testing, the experience of the test operator, and specimen preparation may affect the accuracy of test results. The purpose of this study was to use different classification and feature selection methods to improve the diagnostic accuracy for COVID-19 patients.
Methods: We used seven machine learning algorithms to develop non-NAT models that assist in the diagnosis of COVID-19. To reduce the number of input features while maintaining model performance, and thereby lower cost and time consumption, we applied three feature selection methods (the chi-square test, variance analysis, and feature importance tests) to identify the optimal feature sets.
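As an illustration of the chi-square feature selection step named in the Methods, the following is a minimal sketch, not the authors' implementation. It assumes non-negative feature values and binary labels, scores each feature by comparing its per-class totals with the totals expected if the feature were independent of the class, and keeps the top-scoring features. The synthetic data, the helper names (`chi2_scores`, `select_top_k`, `make_synthetic_data`), and the choice of `k` are all hypothetical.

```python
import random


def chi2_scores(X, y):
    """Chi-square score of each non-negative feature against the class labels.

    Observed: the feature's total within each class.
    Expected: the feature's overall total times that class's share of samples.
    """
    n_samples = len(X)
    n_features = len(X[0])
    classes = sorted(set(y))
    class_share = {
        c: sum(1 for label in y if label == c) / n_samples for c in classes
    }

    scores = []
    for j in range(n_features):
        total = sum(row[j] for row in X)
        score = 0.0
        for c in classes:
            observed = sum(row[j] for row, label in zip(X, y) if label == c)
            expected = total * class_share[c]
            if expected > 0:
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_top_k(scores, k):
    """Indices of the k highest-scoring features, best first."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def make_synthetic_data(n_per_class=200, seed=0):
    """Hypothetical data: features 0 and 1 differ by class, 2 and 3 do not."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([
                max(0.0, rng.gauss(5.0 + 3.0 * label, 1.0)),   # informative
                max(0.0, rng.gauss(10.0 - 4.0 * label, 1.0)),  # informative
                max(0.0, rng.gauss(7.0, 1.0)),                  # noise
                max(0.0, rng.gauss(7.0, 1.0)),                  # noise
            ])
            y.append(label)
    return X, y


X, y = make_synthetic_data()
scores = chi2_scores(X, y)
selected = select_top_k(scores, k=2)
print("chi-square scores:", [round(s, 2) for s in scores])
print("selected feature indices:", sorted(selected))
```

On this synthetic data the two class-dependent features score far above the two noise features, so they are the ones retained. In the study's setting, the rows would be patients, the columns hematological measurements and body temperature, and the retained subset would then be passed to a classifier.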
Findings: The XGBoost and random forest (RF) models performed best for COVID-19 detection, with accuracy above 0·96. The RF model reached an accuracy of 0·968 using only ten hematological features and body temperature.
Interpretation: Ten blood features and body temperature can determine fairly accurately whether a suspected patient has COVID-19. Our model can improve the diagnostic accuracy for COVID-19 and help reduce its spread.
Funding: This work is supported by grants from the National Key Research and Development Program of China under Grant 2017YFE0123600, the Natural Science Foundation of China (81873931, 81974382 and 81773104), the Frontier Exploration Program of Huazhong University of Science and Technology (2015TS153), and the Major Scientific and Technological Innovation Projects in Hubei Province (2018ACA136).
Declaration of Interests: All the authors stated that the paper had never been published elsewhere, and that there were no competing economic interests.
Ethics Approval Statement: The collection, use, and retrospective analysis of chest CT images, CFs and SARS-CoV-2 nucleic acid PCR results of patients were approved by the institutional ethical committees of HUST-UH (IRB ID: [2020] IEC(A001)).
Keywords: COVID-19; non-NAT; machine learning; hematological features; optimum feature set