Preprints with The Lancet is part of SSRN's First Look, a place where journals identify content of interest prior to publication. Authors have opted in at submission to The Lancet family of journals to post their preprints on Preprints with The Lancet. The usual SSRN checks and a Lancet-specific check for appropriateness and transparency have been applied. Preprints available here are not Lancet publications or necessarily under review with a Lancet journal. These preprints are early-stage research papers that have not been peer-reviewed. The findings should not be used for clinical or public health decision making and should not be presented to a lay audience without highlighting that they are preliminary and have not been peer-reviewed. For more information on this collaboration, see the comments published in The Lancet about the trial period, and our decision to make this a permanent offering, or visit The Lancet's FAQ page, and for any feedback please contact preprints@lancet.com.

Understanding the Quality of Ethnicity Data Recorded in Health-Related Administrative Data Sources Compared with Census 2021 in England

29 Pages Posted: 1 Apr 2024

Cameron Razieh

University of Leicester

Bethan Powell

Government of the United Kingdom - Office for National Statistics

Rosemary Drummond

Government of the United Kingdom - Office for National Statistics

Isobel Ward

Government of the United Kingdom - Office for National Statistics; Government of the United Kingdom - Health Analysis and Life Events Division

Jasper Morgan

Government of the United Kingdom - Health Analysis and Life Events Division

Myer Glickman

Government of the United Kingdom - Office for National Statistics; Government of the United Kingdom - Health Analysis and Life Events Division

Chris White

Government of the United Kingdom - Office for National Statistics

Francesco Zaccardi

University of Leicester

Jonathan Hope

NHS England and NHS Improvement

Veena Raleigh

King's Fund

Ashley Akbari

Swansea University - Population Data Science

Nazrul Islam

University of Southampton

Thomas Yates

University of Leicester

Lisa Murphy

Wellcome Trust

Bilal Mateen

PATH

Kamlesh Khunti

University of Leicester - Leicester Diabetes Centre

Vahe Nafilyan

Government of the United Kingdom - Health Analysis and Life Events Division

Abstract

Background: Electronic health records (EHR) are increasingly used to investigate health inequalities across ethnic groups. Although some studies show that the recording of ethnicity in EHR is imperfect, there is no robust evidence on the agreement between the ethnicity information recorded in various real-world sources and census data.

Methods: We linked primary and secondary care NHS England data sources with Census 2021 and compared individual-level agreement of ethnicity recording in General Practice Extraction Service (GPES) Data for Pandemic Planning and Research (GDPPR), Hospital Episode Statistics (HES), Ethnic Category Information Asset (ECIA) and Talking Therapies for anxiety and depression (TT) with ethnicity reported in census. Census ethnicity is self-reported and, therefore, regarded as the most reliable population-level source of ethnicity recording. We further assessed the impact of multiple approaches to assigning a person an ethnic category.
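As a rough illustration of the individual-level agreement comparison described in the Methods, the following minimal sketch computes, for each census category, the percentage of linked people whose category in a health data source matches their census category. The function name, the choice of the census category as the denominator, and the toy records are assumptions made for illustration only; they are not taken from the study and the toy values are not study data.

```python
from collections import Counter


def agreement_by_census_category(linked_pairs):
    """Percentage agreement per census category.

    linked_pairs: iterable of (census_category, source_category) tuples,
    one per linked person. Assumption: the denominator is the number of
    linked people in each census category.
    """
    totals = Counter()
    matches = Counter()
    for census_cat, source_cat in linked_pairs:
        totals[census_cat] += 1
        if source_cat == census_cat:
            matches[census_cat] += 1
    return {cat: 100.0 * matches[cat] / totals[cat] for cat in totals}


# Made-up linked pairs (census category, source category); not study data.
toy_pairs = [
    ("White British", "White British"),
    ("White British", "White British"),
    ("Bangladeshi", "Bangladeshi"),
    ("Bangladeshi", "Any Other Asian"),
    ("Other Black", "Black African"),
]

agreement = agreement_by_census_category(toy_pairs)
for category, percent in sorted(agreement.items()):
    print(f"{category}: {percent:.1f}%")
```

Under this definition, a low percentage for a category means that most people who self-reported that category in the census are recorded under a different category in the health data source.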

Findings: The numbers of people who could be linked to census from ECIA, GDPPR, HES and TT were 47.4m, 43.5m, 47.8m and 6.3m, respectively. Across all four data sources, the White British category had the highest level of agreement with census (≥96%), followed by the Bangladeshi category (≥93%). Levels of agreement for the Pakistani, Indian, and Chinese categories were ≥87%, ≥82% and ≥80%, respectively, across all sources. Agreement was lower for the Mixed (≤75%) and Other (≤71%) categories across all data sources. The categories with the lowest agreement were Gypsy or Irish Traveller (≤6%), Other Black (≤19%) and Any Other Ethnic Group (≤25%).

Interpretation: Certain ethnic categories across all data sources have high discordance with census ethnic categories. These differences may lead to biased estimates of differences in health outcomes between ethnic groups, a critical data point used when making health policy and planning decisions.

Funding: Wellcome Trust.

Declaration of Interest: The author(s) from the Wellcome Trust conceived this project, and were also its funder(s). Specifically, BAM was the accountable individual for this piece of commissioned research. The eventual scope of the project described by the manuscript was collaboratively agreed by the team led by CR, BC, RD, IW, VN and MG, and the authors from Wellcome. The Wellcome authors contributed to the intellectual development of the idea contained in this manuscript, and its drafting, and thus are credited as co-authors in keeping with the ICMJE guidelines. KK is Director of the Centre for Ethnic Health Research at University of Leicester and Co-Chair of the Ethnicity Coding Group for HDRUK. Authors from ONS declare no competing interests.

Ethical Approval: Ethical approval was obtained from the National Statistician’s Data Ethics Advisory Committee (NSDEC(20)12). This study involved secondary use of administrative datasets. Therefore, informed consent was not required.

Keywords: Ethnicity, coding, electronic health records, administrative data, data quality

Suggested Citation

Razieh, Cameron and Powell, Bethan and Drummond, Rosemary and Ward, Isobel and Morgan, Jasper and Glickman, Myer and White, Chris and Zaccardi, Francesco and Hope, Jonathan and Raleigh, Veena and Akbari, Ashley and Islam, Nazrul and Yates, Thomas and Murphy, Lisa and Mateen, Bilal and Khunti, Kamlesh and Nafilyan, Vahe, Understanding the Quality of Ethnicity Data Recorded in Health-Related Administrative Data Sources Compared with Census 2021 in England. Available at SSRN: https://ssrn.com/abstract=4775800 or http://dx.doi.org/10.2139/ssrn.4775800

Cameron Razieh (Contact Author)

University of Leicester

University Road
Leicester, LE1 7RH
United Kingdom

Bethan Powell

Government of the United Kingdom - Office for National Statistics

London, SW1A 2AA
United Kingdom

Rosemary Drummond

Government of the United Kingdom - Office for National Statistics

London, SW1A 2AA
United Kingdom

Isobel Ward

Government of the United Kingdom - Office for National Statistics

London, SW1A 2AA
United Kingdom

Government of the United Kingdom - Health Analysis and Life Events Division

Jasper Morgan

Government of the United Kingdom - Health Analysis and Life Events Division

Myer Glickman

Government of the United Kingdom - Office for National Statistics

Government of the United Kingdom - Health Analysis and Life Events Division

Chris White

Government of the United Kingdom - Office for National Statistics

London, SW1A 2AA
United Kingdom

Francesco Zaccardi

University of Leicester

University Road
Leicester, LE1 7RH
United Kingdom

Jonathan Hope

NHS England and NHS Improvement

Veena Raleigh

King's Fund

Ashley Akbari

Swansea University - Population Data Science

Nazrul Islam

University of Southampton

Thomas Yates

University of Leicester

University Road
Leicester, LE1 7RH
United Kingdom

Lisa Murphy

Wellcome Trust

Gibbs Building
215 Euston Road
London, NW1 2BE
United Kingdom

Kamlesh Khunti

University of Leicester - Leicester Diabetes Centre

Leicester
United Kingdom

Vahe Nafilyan

Government of the United Kingdom - Health Analysis and Life Events Division

United Kingdom