How to Deal with Missing Categorical Data: Test of a Simple Bayesian Method
Organizational Research Methods, Vol. 6, No. 3, pp. 309-327, 2003
Posted: 20 Aug 2004
We analyze the efficiency of six missing data techniques for categorical item non-response under the assumption that data are missing at random or missing completely at random. By efficiency we mean that a procedure produces an unbiased estimate of true sample properties and is also easy to implement. The techniques investigated are list-wise deletion, mode substitution, random imputation, two regression imputations, and a Bayesian model-based procedure. We analyze efficiency under six experimental conditions for a survey-based data set. We find that list-wise deletion is efficient for the data analyzed. If data loss due to list-wise deletion is an issue, the analysis points to the Bayesian method. Regression imputation is also efficient, but this result is conditional on the specific data structure and may not hold in general. Regression imputation also raises additional problems that make it less appropriate.
Keywords: Missing Categorical Data, Imputation, Bayesian conjugate analysis
JEL Classification: C12, C19, C53, C82
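To make three of the techniques named in the abstract concrete, the following is a minimal sketch of list-wise deletion, mode substitution, and a conjugate Bayesian imputation for a single categorical item. The abstract does not specify the Bayesian procedure, so the Dirichlet-multinomial model used here is an assumption suggested only by the keyword "Bayesian conjugate analysis"; it ignores covariates and may differ from the method tested in the paper. The function names, the uniform prior, and the use of `None` to mark a missing response are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def listwise_delete(values):
    """Drop every missing response (None) and keep the observed ones."""
    return [v for v in values if v is not None]


def mode_substitute(values):
    """Replace each missing response with the most frequent observed category."""
    observed = listwise_delete(values)
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


def dirichlet_impute(values, categories, prior=1.0, rng=None):
    """Impute missing responses from a Dirichlet-multinomial posterior.

    Assumption: a symmetric Dirichlet(prior) prior on the category
    probabilities; this is an illustrative model, not the paper's procedure.
    """
    rng = rng or random.Random()
    counts = Counter(v for v in values if v is not None)
    # Conjugate update: posterior parameters are prior plus observed counts.
    post_alpha = [prior + counts[c] for c in categories]
    # Draw category probabilities from the posterior Dirichlet by
    # normalizing independent Gamma(alpha, 1) variates.
    draws = [rng.gammavariate(a, 1.0) for a in post_alpha]
    total = sum(draws)
    theta = [d / total for d in draws]
    # Fill each missing response with a category drawn using those probabilities.
    return [
        rng.choices(categories, weights=theta, k=1)[0] if v is None else v
        for v in values
    ]


if __name__ == "__main__":
    responses = ["a", "b", "a", None, "c", "a", None]
    print(listwise_delete(responses))
    print(mode_substitute(responses))
    print(dirichlet_impute(responses, ["a", "b", "c"], rng=random.Random(0)))
```

In this sketch, the probabilities are drawn from the posterior rather than fixed at the observed proportions, so the imputed values carry some of the uncertainty about the category distribution. Mode substitution, by contrast, always inflates the most frequent category.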