
Preprints with The Lancet is a collaboration between The Lancet Group of journals and SSRN to facilitate the open sharing of preprints for early engagement, community comment, and collaboration. Preprints available here are not Lancet publications or necessarily under review with a Lancet journal. These preprints are early-stage research papers that have not been peer-reviewed. The usual SSRN checks and a Lancet-specific check for appropriateness and transparency have been applied. The findings should not be used for clinical or public health decision-making or presented without highlighting these facts. For more information, please see the FAQs.
Subnational Variations in the Quality of Population Health Data: A Geospatial Analysis of Household Surveys in Africa
20 Pages. Posted: 17 Jul 2023
There are 2 versions of this paper
Abstract
Background: In many low- and middle-income countries, household survey data help address health and development challenges and track achievements towards national objectives including the Sustainable Development Goals (SDGs). Such data are widely used and trusted. Yet users often lack critical information about the extent of data errors where it matters most for human wellbeing – at the district level, where health interventions are usually implemented. To assess the magnitude of such data problems, this study estimates the extent and types of errors in nationally representative household survey data from 33 African countries.
Methods: We conducted a comprehensive high-resolution geospatial analysis of household survey data from the most recent surveys, conducted between 2006 and 2019, in 33 countries across Africa, using publicly available data from the Demographic and Health Surveys (DHS). We first calculated the prevalence of data errors at each survey location and then employed Bayesian model-based geostatistics using spatially explicit DHS data and covariates from gridded high-resolution datasets. Our model produced 5 × 5-km gridded estimates of three widely used health data quality indicators: age heaping, incomplete age records of interviewed women, and biologically implausible height-for-age z-scores (HAZ).
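As an illustration of how two of these indicators can be computed for a single survey location, the following is a minimal sketch and not the authors' implementation. It assumes age heaping is summarised with Whipple's index (preference for ages ending in 0 or 5 within the conventional 23 to 62 age range) and that a HAZ value is treated as biologically implausible when it lies outside ±6, the WHO flagging convention; the abstract does not state which definitions or thresholds the study used. The function names and input format are hypothetical.

```python
def whipple_index(ages, lo=23, hi=62):
    """Whipple's index for a list of reported ages in whole years.

    100 means no preference for ages ending in 0 or 5;
    500 means every reported age ends in 0 or 5.
    The 23-62 range is the conventional choice (assumption here).
    """
    in_range = [a for a in ages if lo <= a <= hi]
    if not in_range:
        return None  # no usable ages at this location
    heaped = sum(1 for a in in_range if a % 5 == 0)
    # One in five ages ends in 0 or 5, hence the factor of 5.
    return 100.0 * 5 * heaped / len(in_range)


def implausible_haz_share(haz_scores, limit=6.0):
    """Share of height-for-age z-scores outside [-limit, +limit].

    limit=6.0 follows the WHO flag for HAZ (assumed, not taken
    from the abstract).
    """
    if not haz_scores:
        return None  # no measured children at this location
    flagged = sum(1 for z in haz_scores if abs(z) > limit)
    return flagged / len(haz_scores)


if __name__ == "__main__":
    # Hypothetical values for one survey cluster.
    reported_ages = [25, 30, 30, 31, 35, 40, 40, 42, 45, 50]
    haz = [-7.2, -1.5, 0.3, 6.4, -2.0]
    print(whipple_index(reported_ages))
    print(implausible_haz_share(haz))
```

Location-level values of this kind would then serve as inputs to a geostatistical model; the modelling step itself is not sketched here.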
Findings: We report two important findings. First, the distribution of errors in survey data across and within African countries was systematic: errors increased with remoteness. Second, moving beyond the DHS survey locations, our model found substantial heterogeneity in the distribution of errors at subnational levels. For example, the share of incomplete records of women’s age in Chad (national mean 66·1%) ranged from 91·8% (sd 2·5%) in southern Chad to only 6·8% (sd 2·2%) near the eastern border with Sudan.
Interpretation: This is the first study to estimate the subnational distribution of errors in household survey data at a high spatial resolution. Survey data quality degrades with increased remoteness, a phenomenon that adds to the vulnerability of remote populations. Our results illustrate the magnitude of data errors, contribute to SDG target 17.18 on reliable data availability, and promote better targeting of health interventions and data collection efforts within countries.
Funding: VS is supported by the Austrian National Bank’s Anniversary Fund Grant No. 18157. PW would like to acknowledge the support of Feed the Future Food Systems for Innovation Lab, funded by the United States Agency for International Development, Cooperative Agreement No. 7200AA21LE00001. AF is supported by the University of Edinburgh.
Declaration of Interest: The authors declare no competing interests.
Keywords: data quality, household surveys, measurement error, bias, population health surveys, demography, Bayesian analysis