An Expectation-Maximization Algorithm for Including Oncological COVID-19 Deaths in Survival Analysis
25 Pages. Posted: 8 Jan 2022
Date Written: January 2, 2022
Abstract
We address the problem of how COVID-19 deaths observed in an oncology clinical trial can be
consistently taken into account in typical survival estimates. We focus on oncological patients
because there is empirical evidence of a strong correlation between Covid and cancer deaths, which
implies that Covid deaths cannot be treated simply as non-informative censorings, a property
usually required by the classical survival estimators. We consider the problem in the framework
of the widely used Kaplan-Meier (KM) estimator. Through a counterfactual approach,
we develop an algorithmic method that allows Covid deaths to be included in the observed data by
mean-imputation. The procedure can be viewed as a member of the class of Expectation-Maximization
(EM) algorithms and will be referred to as the Covid-Death Mean-Imputation (CoDMI) algorithm.
We discuss the assumptions underlying CoDMI and the issue of convergence. The
algorithm provides a completed lifetime data set, where each Covid-death time is replaced
by a point estimate of the corresponding virtual lifetime. This complete data set is naturally
equipped with the corresponding KM survival function estimate and all available statistical
tools can be applied to these data. However, because imputed values are treated as if they were
observed, mean-imputation requires the variance of the estimates to be increased. We therefore
propose a natural extension of the classical Greenwood's formula,
thus obtaining expanded confidence intervals for the survival function estimate. To illustrate
how the algorithm works, CoDMI is applied to real medical data extended by the addition
of artificial Covid death observations. The results are compared with the estimates provided
by the two naïve approaches, which count Covid deaths either as censorings or as deaths from the
disease under study. To evaluate the predictive performance of CoDMI, an extensive
simulation study is carried out. The results indicate that, in the simulated scenarios, CoDMI
is roughly unbiased and outperforms the estimates obtained by the naïve approaches. A
user-friendly version of CoDMI programmed in R is freely available.
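For readers who want a concrete picture of the ingredients named above, the following is a minimal Python sketch and not the authors' R implementation. It computes the KM estimate with the classical (non-extended) Greenwood variance, then iterates a simple mean-imputation loop in which each Covid-death time c is replaced by the conditional mean lifetime E[min(T, tau) | T > c] under the current KM curve. The function names, the status coding (1 = death from the disease, 0 = censored, 2 = Covid death), the truncation at the largest observed time tau, the choice to start with Covid deaths treated as censored, and the choice to treat imputed times as disease deaths are all assumptions made for illustration; the paper's actual rules and its extended variance formula may differ.

```python
import numpy as np

def kaplan_meier(times, events):
    """KM survival and classical Greenwood variance at each distinct event time."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    surv, var_sum = 1.0, 0.0
    S, V = [], []
    for t in event_times:
        n = np.sum(times >= t)               # subjects at risk just before t
        d = np.sum((times == t) & events)    # events occurring exactly at t
        surv *= 1.0 - d / n
        if n > d:                            # term is undefined when everyone at risk dies
            var_sum += d / (n * (n - d))
        S.append(surv)
        V.append(surv ** 2 * var_sum)        # Greenwood: S(t)^2 * sum d / (n (n - d))
    return event_times, np.array(S), np.array(V)

def conditional_mean(c, event_times, S, tau):
    """E[min(T, tau) | T > c] for the right-continuous step function S."""
    if tau <= c:
        return c
    S_ext = np.concatenate(([1.0], S))       # S equals 1 before the first event time
    inner = event_times[(event_times > c) & (event_times < tau)]
    knots = np.concatenate(([c], inner, [tau]))
    # Value of S on each interval [knots[i], knots[i+1])
    s_left = S_ext[np.searchsorted(event_times, knots[:-1], side="right")]
    if s_left[0] <= 0.0:
        return c
    return c + np.sum(s_left * np.diff(knots)) / s_left[0]

def impute_covid_deaths(times, status, max_iter=100, tol=1e-8):
    """Hypothetical mean-imputation loop; status: 1 death, 0 censored, 2 Covid death."""
    times = np.asarray(times, dtype=float)
    status = np.asarray(status)
    covid = status == 2
    tau = times.max()                        # assumption: truncate the mean at tau
    t_work = times.copy()
    ev = status == 1                         # assumption: start with Covid deaths censored
    for _ in range(max_iter):
        et, S, _ = kaplan_meier(t_work, ev)                      # "M-step": refit KM
        new = np.array([conditional_mean(c, et, S, tau)          # "E-step": re-impute
                        for c in times[covid]])
        delta = np.max(np.abs(new - t_work[covid])) if covid.any() else 0.0
        t_work[covid] = new
        ev = (status == 1) | covid           # assumption: imputed times count as deaths
        if delta < tol:
            break
    return t_work, ev

# Toy data, invented purely for illustration
toy_times = [2, 3, 4, 5, 6, 8]
toy_status = [1, 0, 2, 1, 1, 1]
completed_times, completed_events = impute_covid_deaths(toy_times, toy_status)
km_times, km_surv, km_var = kaplan_meier(completed_times, completed_events)
```

The loop stops when successive imputations agree to within `tol`, but nothing in this sketch guarantees convergence in general, which is why `max_iter` is kept as a safeguard. The Greenwood variance returned here treats imputed times as if they were observed, so it would understate the uncertainty that the extended formula is meant to capture.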
Note:
Funding: This research received no external funding.
Declaration of Interests: The authors declare no conflict of interest.
Keywords: COVID-19, Survival Analysis, Kaplan-Meier estimator, informative censoring, extended Greenwood's formula, EM algorithm, mean-imputation