Identifying Type 1 and 2 Diabetes in Population Level Data: Assessing the Accuracy of Published Approaches

Aims: Population datasets are increasingly used to study type 1 (T1D) or type 2 diabetes (T2D) and to inform clinical practice, but correctly classifying diabetes type in insulin-treated individuals is challenging. We aimed to compare the performance of approaches for classifying insulin-treated diabetes in research studies, evaluated against two independent biological definitions of diabetes type.

Method: We compared the accuracy of thirteen reported approaches for classifying insulin-treated diabetes into T1D and T2D in two population cohorts with diabetes: UK Biobank (UKBB), n=26,399, and the Diabetes Alliance for Research in England (DARE), n=1,296. Overall accuracy and predictive values for classifying T1D and T2D were assessed against: 1) a T1D genetic risk score and genetic stratification method (UKBB); 2) C-peptide measured at >3 years diabetes duration (DARE).
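As an illustration of the evaluation measures named above, the following is a minimal sketch of how overall accuracy and predictive values could be computed for one classification approach against a reference definition of diabetes type. The function and class names (`evaluate`, `ClassificationMetrics`) and the label strings are hypothetical and chosen for this example; the abstract does not describe the authors' implementation.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ClassificationMetrics:
    accuracy: float   # proportion of participants whose predicted type matches the reference
    ppv_t1d: float    # proportion of predicted T1D that are T1D by the reference
    ppv_t2d: float    # proportion of predicted T2D that are T2D by the reference


def evaluate(predicted: Sequence[str], reference: Sequence[str]) -> ClassificationMetrics:
    """Compare predicted labels ('T1D' or 'T2D') with reference labels.

    Hypothetical helper for illustration only.
    """
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("predicted and reference must be non-empty and of equal length")

    pairs = list(zip(predicted, reference))
    accuracy = sum(p == r for p, r in pairs) / len(pairs)

    def predictive_value(label: str) -> float:
        # Reference labels of everyone the approach assigned to `label`.
        called = [r for p, r in pairs if p == label]
        if not called:
            return math.nan  # undefined when nobody is assigned this label
        return sum(r == label for r in called) / len(called)

    return ClassificationMetrics(accuracy, predictive_value("T1D"), predictive_value("T2D"))
```

With a reference derived from a biomarker such as C-peptide, `ppv_t1d` describes how often a participant labelled T1D by the approach is T1D by the reference, which is the quantity of interest when the research aim is a T1D cohort with minimal misclassification.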

Results: Accuracy of approaches ranged from 71% to 88% in UKBB and from 68% to 88% in DARE. When classifying all participants, combining early insulin requirement with a T1D probability model incorporating continuous clinical features (diagnosis age and BMI only) consistently achieved high accuracy (UKBB 87%, DARE 85%). Self-reported diabetes type alone had high accuracy (UKBB 87%, DARE 88%) but was available in just 15% of UKBB participants. For identifying T1D with minimal misclassification, models with high thresholds or young age at diagnosis (<20 years) performed best. An online tool developed from all UKBB findings identifies the optimum approach of those tested based on variable availability and the research aim.
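The combined approach described above can be sketched as a simple rule. This is an illustrative assumption, not the authors' implementation: the abstract does not state how early insulin requirement is defined, how it is combined with the model output, or which probability threshold is used. The function name `classify_insulin_treated`, the requirement that both criteria hold, and the default threshold of 0.5 are all hypothetical; the T1D probability is taken as an input from whichever clinical-features model is available.

```python
def classify_insulin_treated(
    early_insulin: bool,
    t1d_probability: float,
    threshold: float = 0.5,  # hypothetical default, not taken from the study
) -> str:
    """Label an insulin-treated participant as 'T1D' or 'T2D'.

    Assumed rule for illustration: T1D only when insulin was required early
    AND the model probability of T1D reaches the threshold.
    """
    if not 0.0 <= t1d_probability <= 1.0:
        raise ValueError("t1d_probability must lie between 0 and 1")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie between 0 and 1")

    if early_insulin and t1d_probability >= threshold:
        return "T1D"
    return "T2D"
```

Under this assumed rule, raising `threshold` assigns fewer participants to T1D, which corresponds to the high-threshold setting described for studies that need a T1D group with minimal misclassification.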

Conclusion: Self-reported diagnosis and models combining continuous features with early insulin requirement are the most accurate methods of classifying insulin-treated diabetes in research datasets without measured classification biomarkers.

Note:

Funding Information: The Diabetes Alliance for Research in England (DARE) study was funded by the Wellcome Trust and supported by the Exeter NIHR Clinical Research Facility. NJT is funded by a Wellcome Trust-funded GW4 PhD. AM is supported by a National Institute for Health Research (NIHR) Academic Clinical Fellowship. MNW is supported by the Wellcome Trust Institutional Support Fund (WT097835MF). SAS is supported by a Diabetes UK PhD studentship (17/0005757). JMD is supported by an Independent Fellowship funded by Research England’s Expanding Excellence in England (E3) fund. KGY is supported by Research England’s Expanding Excellence in England (E3) fund. ATH is supported by the NIHR Exeter Clinical Research Facility, a Wellcome Senior Investigator award, and an NIHR Senior Investigator award. AGJ was supported by an NIHR Clinician Scientist award (CS-2015-15-018).

Declaration of Interests: AGJ contributed to the development of the two classification models assessed in this work. Other authors declare that there are no relationships or activities that might bias, or be perceived to bias, their work.

Ethical Approval Statement: Ethical approval for the Diabetes Alliance for Research in England (DARE) study was granted by the Devon & Torbay Research Ethics Committee, ref: 2002/7/118.

Keywords: Diabetes Classification, Population Studies, Cohort Stratification

Suggested Citation:

Thomas, Nicholas J. and McGovern, Andrew and Young, Katherine and Sharp, Seth A. and Weedon, Michael N. and Hattersley, Andrew and Dennis, John and Jones, Angus G., Identifying Type 1 and 2 Diabetes in Population Level Data: Assessing the Accuracy of Published Approaches. Available at SSRN: https://ssrn.com/abstract=4125231 or http://dx.doi.org/10.2139/ssrn.4125231