Preprints with The Lancet is part of SSRN's First Look, a place where journals identify content of interest prior to publication. Authors have opted in at submission to The Lancet family of journals to post their preprints on Preprints with The Lancet. The usual SSRN checks and a Lancet-specific check for appropriateness and transparency have been applied. Preprints available here are not Lancet publications or necessarily under review with a Lancet journal. These preprints are early stage research papers that have not been peer-reviewed. The findings should not be used for clinical or public health decision making and should not be presented to a lay audience without highlighting that they are preliminary and have not been peer-reviewed. For more information on this collaboration, see the comments published in The Lancet about the trial period, and our decision to make this a permanent offering, or visit The Lancet's FAQ page, and for any feedback please contact preprints@lancet.com.
Evaluation of GPT-4 for 10-Year Cardiovascular Risk Prediction: Insights from the UK Biobank and KoGES Data
23 pages. Posted: 28 Sep 2023
Abstract
Background: Cardiovascular disease (CVD) is a leading global health concern, and accurate risk prediction is vital for prevention. Traditional models such as the Framingham and ACC/AHA risk scores have been clinical mainstays. Recently, artificial intelligence (AI), and specifically large language models (LLMs) such as GPT-4, has shown promise in the medical field. However, concerns about reliability and reproducibility complicate the integration of these models into medical practice.
Methods: We used the UK Biobank cohort to assess 10-year CVD risk, defined by the occurrence of major adverse cardiovascular events (MACE) identified through ICD-10 codes. The Korean Genome and Epidemiology Study (KoGES) served as a validation cohort. We generated cardiovascular risk predictions with both conventional cardiovascular risk scores and GPT models. The robustness of the models was evaluated through various statistical tests.
Findings: From 502,396 UK Biobank participants, 47,468 were selected for analysis after exclusions. In the KoGES cohort, 5,718 out of 10,030 participants were analyzed. Iterative testing identified the optimal temperature setting for GPT-4 as 0.4. At this setting, GPT-4’s AUROC in predicting 10-year MACE was 0.725 for the UK Biobank and 0.664 for KoGES.
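As an illustration of the metric reported above, the following minimal Python sketch computes AUROC from binary outcome labels and predicted risk scores using the rank-based (Mann-Whitney) formulation, then picks the temperature setting with the highest AUROC. It is not the authors' pipeline: the function names `auroc` and `best_temperature`, the candidate temperatures, and all labels and scores are hypothetical toy values chosen only to show the calculation.

```python
from typing import Dict, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via the Mann-Whitney U statistic; tied scores share an average rank."""
    n = len(labels)
    if n != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event and one non-event")

    # Rank scores in ascending order, averaging ranks within tied groups.
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U counts (event, non-event) pairs where the event has the higher score.
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def best_temperature(
    labels: Sequence[int],
    scores_by_temperature: Dict[float, Sequence[float]],
) -> Tuple[float, float]:
    """Return (temperature, AUROC) for the setting with the highest AUROC."""
    results = {t: auroc(labels, s) for t, s in scores_by_temperature.items()}
    best = max(results, key=lambda t: results[t])
    return best, results[best]


# Hypothetical toy data: 1 = MACE within 10 years, 0 = no event.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
# Hypothetical predicted risks at two assumed temperature settings.
toy_scores = {
    0.0: [0.10, 0.40, 0.35, 0.20, 0.30, 0.80, 0.50, 0.60],
    0.4: [0.10, 0.20, 0.35, 0.30, 0.70, 0.80, 0.40, 0.60],
}

if __name__ == "__main__":
    temperature, value = best_temperature(toy_labels, toy_scores)
    print(f"best temperature: {temperature}, AUROC: {value:.4f}")
```

An AUROC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of events from non-events, which is the scale on which the reported values of 0.725 and 0.664 should be read.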
Interpretation: Our study demonstrated that LLMs, particularly GPT-4, can provide relatively accurate predictions for CVD risk across ethnically diverse datasets. These models show potential for more extensive applications in healthcare.
Funding: This research was supported by a grant from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number: HI22C0452).
Declaration of Interest: None to declare.
Keywords: Cardiovascular Risk, GPT-4, ChatGPT, Large Language Model