
Preprints with The Lancet is part of SSRN's First Look, a place where journals identify content of interest prior to publication. Authors have opted in at submission to The Lancet family of journals to post their preprints on Preprints with The Lancet. The usual SSRN checks and a Lancet-specific check for appropriateness and transparency have been applied. Preprints available here are not Lancet publications or necessarily under review with a Lancet journal. These preprints are early stage research papers that have not been peer-reviewed. The findings should not be used for clinical or public health decision making and should not be presented to a lay audience without highlighting that they are preliminary and have not been peer-reviewed. For more information on this collaboration, see the comments published in The Lancet about the trial period, and our decision to make this a permanent offering, or visit The Lancet's FAQ page, and for any feedback please contact preprints@lancet.com.

Prediction of the Number of New Cases of 2019 Novel Coronavirus (COVID-19) Using a Social Media Search Index

37 Pages Posted: 24 Mar 2020


Lei Qin

University of International Business and Economics (UIBE) - School of Statistics

Qiang Sun

University of International Business and Economics (UIBE) - School of Statistics

Yidan Wang

University of International Business and Economics (UIBE) - School of Statistics

Ke-Fei Wu

Fu Jen Catholic University - Graduate Institute of Business Administration

Mingchih Chen

Fu Jen Catholic University - Graduate Institute of Business Administration

Ben-Chang Shia

Taipei Medical University - Research Center of Big Data; Taipei Medical University - College of Management; Taipei Medical University - Business Administration in Biotechnology

Szu-Yuan Wu

Fu Jen Catholic University - Graduate Institute of Business Administration; Asia University - Department of Food Nutrition and Health Biotechnology; Lotung Poh-Ai Hospital - Division of Radiation Oncology; Lotung Poh-Ai Hospital - Big Data Center; Asia University - Department of Healthcare Administration; Taipei Medical University - Department of Radiation Oncology


Abstract

Purpose: Predicting the number of new-suspected or confirmed cases of novel Coronavirus Disease 2019 (COVID-19) is crucial in the prevention and control of the COVID-19 outbreak.

Methods: Social media search indexes (SMSI) for dry cough, fever, chest distress, coronavirus, and pneumonia were collected from December 31, 2019, to February 9, 2020. Data on new-suspected cases of COVID-19 were collected from January 20, 2020, to February 9, 2020. We used the lagged series of SMSI to predict new-suspected COVID-19 case numbers during this period. To avoid overfitting, five methods, namely subset selection, forward selection, lasso regression, ridge regression, and elastic net, were used to estimate coefficients. We selected the optimal method to predict new-suspected COVID-19 case numbers over the subsequent 20 days. We further validated the optimal method on new-confirmed cases of COVID-19 from December 31, 2019, to February 17, 2020.
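The idea of a lagged series can be illustrated with a minimal sketch. It is not the authors' code: the function names (`pearson`, `lagged_pairs`, `best_lag`) and the data are hypothetical, and it assumes both series cover the same days, so each lag shortens the overlap. It pairs the search index on day t - lag with the case count on day t and reports the lag with the highest Pearson correlation.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def lagged_pairs(index, cases, lag):
    """Pair the search index on day t - lag with the case count on day t."""
    if lag == 0:
        return list(index), list(cases)
    return list(index[:-lag]), list(cases[lag:])

def best_lag(index, cases, max_lag):
    """Return (lag, r) with the highest correlation among lags 0..max_lag."""
    scores = {}
    for lag in range(max_lag + 1):
        x, y = lagged_pairs(index, cases, lag)
        scores[lag] = pearson(x, y)
    lag = max(scores, key=scores.get)
    return lag, scores[lag]

# Synthetic data (not from the paper): cases copy the index with a 3-day delay.
index = [1, 3, 2, 5, 8, 6, 9, 12, 10, 14, 13, 17]
cases = [0, 0, 0] + index[:-3]
lag, r = best_lag(index, cases, max_lag=5)
```

On this synthetic input the scan recovers the 3-day delay that was built into the data. In a setting like the one described above, the lagged index series would then serve as candidate predictors for a regularized or subset-based regression; that step is not shown here.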

Results: The new-suspected COVID-19 case numbers were significantly correlated with the lagged series of SMSI. SMSI signals could be detected 6–9 days earlier than new-suspected cases of COVID-19. The optimal method was subset selection, which had the lowest estimation error and a moderate number of predictors. In validation, the subset selection method was also significantly correlated with new-confirmed COVID-19 cases. SMSI findings on lag day 10 were significantly correlated with new-confirmed COVID-19 cases.

Conclusions: SMSI could be a significant predictor of the number of COVID-19 infections.

Funding Statement: Lo-Hsu Medical Foundation, Lotung Poh-Ai Hospital, supports Szu-Yuan Wu's work (Funding Number: 10908 and 10909). Lei Qin's work is supported by University of International Business and Economics Huiyuan outstanding young scholars research funding (17YQ15) and "the Fundamental Research Funds for the Central Universities" in UIBE (CXTD10-10). Ben-Chang Shia's work was also supported by an institutional grant from Taipei Medical University (Taipei, Taiwan) for New Faculty Research (TMU103-AE1-B22).

Declaration of Interests: The authors have no potential conflicts of interest to declare.

Keywords: social media; COVID-19; predictor; outbreak; new case

Suggested Citation

Qin, Lei and Sun, Qiang and Wang, Yidan and Wu, Ke-Fei and Chen, Mingchih and Shia, Ben-Chang and Wu, Szu-Yuan, Prediction of the Number of New Cases of 2019 Novel Coronavirus (COVID-19) Using a Social Media Search Index (3/10/2020). Available at SSRN: https://ssrn.com/abstract=3552829 or http://dx.doi.org/10.2139/ssrn.3552829

Lei Qin

University of International Business and Economics (UIBE) - School of Statistics

Beijing
China

Qiang Sun

University of International Business and Economics (UIBE) - School of Statistics

Beijing
China

Yidan Wang

University of International Business and Economics (UIBE) - School of Statistics

Beijing
China

Ke-Fei Wu

Fu Jen Catholic University - Graduate Institute of Business Administration

Taiwan

Mingchih Chen

Fu Jen Catholic University - Graduate Institute of Business Administration

Taiwan

Ben-Chang Shia

Taipei Medical University - Research Center of Big Data

Taiwan

Taipei Medical University - College of Management

Taiwan

Taipei Medical University - Business Administration in Biotechnology

Taiwan

Szu-Yuan Wu (Contact Author)

Fu Jen Catholic University - Graduate Institute of Business Administration

Asia University - Department of Food Nutrition and Health Biotechnology

Taiwan

Lotung Poh-Ai Hospital - Division of Radiation Oncology

Taiwan

Lotung Poh-Ai Hospital - Big Data Center

Taiwan

Asia University - Department of Healthcare Administration

Taiwan

Taipei Medical University - Department of Radiation Oncology

No. 111, Section 3
Hsing-Long Road
Taipei, 116
Taiwan

