An Expectation-Maximization Algorithm for Including Oncological COVID-19 Deaths in Survival Analysis

25 Pages Posted: 8 Jan 2022

Francesca De Felice

Sapienza University of Rome - Department of Earth Sciences and Forecasting Research Center, Prevention and Control of Geological Risks

Luca Mazzoni

University of Florence - Dipartimento di Statistica, Informatica, Applicazioni (DiSIA); Alef-servizi spa

Franco Moriconi

University of Perugia - Department of Economics

Date Written: January 2, 2022

Abstract

We address the problem of how COVID-19 deaths observed in an oncology clinical trial can be
consistently taken into account in typical survival estimates. We refer to oncological patients
since there is empirical evidence of strong correlation between Covid and cancer deaths, which
implies that Covid deaths cannot be treated simply as non-informative censorings, a property
usually required by the classical survival estimators. We consider the problem in the framework
of the widely used Kaplan-Meier (KM) estimator. Using a counterfactual approach,
we develop an algorithm that includes Covid deaths in the observed data by
mean-imputation. The procedure belongs to the class of Expectation-Maximization
(EM) algorithms and will be referred to as the Covid-Death Mean-Imputation (CoDMI) algorithm.
We discuss the assumptions underlying CoDMI and the issue of its convergence. The
algorithm provides a completed lifetime data set, where each Covid-death time is replaced
by a point estimate of the corresponding virtual lifetime. This complete data set is naturally
equipped with the corresponding KM survival function estimate and all available statistical
tools can be applied to these data. However, mean-imputation requires inflating the variance
of the estimates. We therefore propose a natural extension of the classical Greenwood's formula,
thus obtaining expanded confidence intervals for the survival function estimate. To illustrate
how the algorithm works, CoDMI is applied to real medical data extended by the addition
of artificial Covid death observations. The results are compared with the estimates provided
by the two naïve approaches that count Covid deaths either as censorings or as deaths from the
disease under study. To evaluate the predictive performance of CoDMI, an extensive
simulation study is carried out. The results indicate that in the simulated scenarios CoDMI
is roughly unbiased and outperforms the estimates obtained by the naïve approaches. A
user-friendly version of CoDMI programmed in R is freely available.
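
The ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' CoDMI implementation (which is in R) and does not reproduce their extended Greenwood's formula. It only shows a plain KM estimate with the classical Greenwood variance, plus a simplified fixed-point loop that replaces each Covid-death time by a conditional mean lifetime under the current KM curve. The function names, the restriction of the mean to the largest observed time, and the choice to count imputed lifetimes as deaths from the disease are all assumptions made for illustration.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t), Greenwood variance)] at each distinct death time.

    events[i] is 1 for a death from the disease and 0 for a censoring.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, gw_sum = 1.0, 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        ties = sum(1 for u, _ in data if u == t)
        deaths = sum(1 for u, e in data if u == t and e)
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            if n_at_risk > deaths:  # Greenwood term is undefined once S hits 0
                gw_sum += deaths / (n_at_risk * (n_at_risk - deaths))
            curve.append((t, surv, surv * surv * gw_sum))
        n_at_risk -= ties
        i += ties
    return curve


def surv_at(curve, t):
    """Evaluate the right-continuous KM step function at time t."""
    s = 1.0
    for u, s_u, _ in curve:
        if u > t:
            break
        s = s_u
    return s


def conditional_mean(curve, t, horizon):
    """E[min(T, horizon) | T > t] under the KM curve (assumed restriction)."""
    s_t = surv_at(curve, t)
    if s_t <= 0.0 or t >= horizon:
        return t
    grid = [t] + [u for u, _, _ in curve if t < u < horizon] + [horizon]
    area = sum(surv_at(curve, a) * (b - a) for a, b in zip(grid, grid[1:]))
    return t + area / s_t


def mean_impute(times, events, covid, tol=1e-8, max_iter=200):
    """Simplified mean-imputation loop (illustrative, not CoDMI itself).

    covid[i] is True when observation i is a Covid death. Its time is
    replaced by a conditional mean lifetime and, by assumption, counted
    as a death from the disease when the KM curve is re-estimated.
    """
    horizon = max(times)
    status = [1 if c else e for e, c in zip(events, covid)]
    current = list(times)
    for _ in range(max_iter):  # convergence is not guaranteed; cap iterations
        curve = kaplan_meier(current, status)
        updated = [conditional_mean(curve, t, horizon) if c else t
                   for t, c in zip(times, covid)]
        change = max(abs(a - b) for a, b in zip(updated, current))
        current = updated
        if change < tol:
            break
    return current


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    times = [2, 4, 5, 8, 10]
    events = [1, 1, 0, 1, 1]
    covid = [False, False, True, False, False]
    completed = mean_impute(times, events, covid)
    print(completed)
    print(kaplan_meier(completed, [1 if c else e for e, c in zip(events, covid)]))
```

The loop alternates between estimating the survival curve from the completed data and updating the imputed lifetimes from that curve, which is the sense in which such a procedure resembles an EM iteration. The Greenwood variance returned here treats imputed times as if they were observed, so it understates the uncertainty that the abstract says must be accounted for.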

Note:
Funding: This research received no external funding.

Declaration of Interests: The authors declare no conflict of interest.

Keywords: COVID-19, Survival Analysis, Kaplan-Meier estimator, informative censoring, extended Greenwood's formula, EM algorithm, mean-imputation

Suggested Citation

De Felice, Francesca and Mazzoni, Luca and Moriconi, Franco, An Expectation-Maximization Algorithm for Including Oncological COVID-19 Deaths in Survival Analysis (January 2, 2022). Available at SSRN: https://ssrn.com/abstract=3998500 or http://dx.doi.org/10.2139/ssrn.3998500

Francesca De Felice

Sapienza University of Rome - Department of Earth Sciences and Forecasting Research Center, Prevention and Control of Geological Risks

Rome
Italy

Luca Mazzoni

University of Florence - Dipartimento di Statistica, Informatica, Applicazioni (DiSIA)

Viale Morgagni, 59
Florence, 50134
Italy

Alef-servizi spa

Rome
Italy

Franco Moriconi (Contact Author)

University of Perugia - Department of Economics

via Pascoli, 20
Perugia, 06123
Italy
