
Preprints with The Lancet is a collaboration between The Lancet Group of journals and SSRN to facilitate the open sharing of preprints for early engagement, community comment, and collaboration. Preprints available here are not Lancet publications or necessarily under review with a Lancet journal. These preprints are early-stage research papers that have not been peer-reviewed. The usual SSRN checks and a Lancet-specific check for appropriateness and transparency have been applied. The findings should not be used for clinical or public health decision-making or presented without highlighting these facts. For more information, please see the FAQs.
AI-Driven Ultrasound Detection of Ovarian Cancer that Generalizes: An International Multicentre Validation Study
33 Pages. Posted: 22 Jan 2024
Abstract
Background: A critical shortage of expert ultrasound examiners has raised concerns about unnecessary interventions and delayed cancer diagnoses. Artificial intelligence (AI)-driven diagnostic support has the potential to alleviate this burden and improve patient outcomes. Deep learning, applied to ultrasound images, has recently demonstrated promising results in ovarian cancer detection; however, external validation is lacking. Our aim was to develop a deep neural network (DNN) model for ovarian cancer detection and evaluate its robustness and ability to generalize across different patient populations in a large multicentre setting. We also sought to assess an AI-assisted triage strategy and compare it to current practice in a retrospective simulation.
Methods: We retrospectively collected 17119 ultrasound images from 3652 women with an ovarian lesion (2224 benign, 1428 malignant) from 20 centres in 8 countries. A total of 2718 cases were externally reviewed by a minimum of 7 expert and 6 non-expert examiners. Using a leave-one-centre-out cross-validation scheme, for each centre in turn, we trained a transformer-based DNN model using data from the remaining 19 centres and compared the models' performance with that of expert and non-expert examiners in terms of accuracy, sensitivity, specificity, and F1 score. Furthermore, we retrospectively simulated and assessed how these models could be used in AI-assisted triage.
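The leave-one-centre-out scheme described in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the function name `leave_one_centre_out` and the toy centre labels are hypothetical, and model training is omitted. It only shows how each centre is held out in turn while the remaining centres form the training set.

```python
from collections import defaultdict


def leave_one_centre_out(centre_ids):
    """Yield (held_out_centre, train_indices, test_indices) for each centre.

    centre_ids holds one centre label per case (hypothetical input format).
    """
    by_centre = defaultdict(list)
    for idx, centre in enumerate(centre_ids):
        by_centre[centre].append(idx)

    for held_out in sorted(by_centre):
        # Test on the held-out centre only.
        test_idx = by_centre[held_out]
        # Train on every case from all other centres.
        train_idx = sorted(
            i
            for centre, idxs in by_centre.items()
            if centre != held_out
            for i in idxs
        )
        yield held_out, train_idx, test_idx


# Toy data: six cases from three made-up centres.
centres = ["A", "A", "B", "C", "C", "C"]
folds = list(leave_one_centre_out(centres))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

With 20 centres, as in the study, this loop would produce 20 folds, each training on 19 centres and testing on the one left out, so every reported prediction comes from a model that never saw data from that case's centre.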
Findings: Our models demonstrated robust performance across centres, ultrasound systems, and different histological diagnoses, with an overall area under the receiver operating characteristic curve (AUC) of 0·929 (95% CI, 0·919–0·938), and F1 score of 83·66% (95% CI, 82·25–85·04) on cases from unseen centres. They outperformed both expert and non-expert examiners, whose F1 scores were 79·85% (95% CI, 78·32–81·33; Δ = 3·82 [95% CI, 2·42–5·24; p < 0·0001]) and 74·70% (95% CI, 73·03–76·30; Δ = 8·96 [95% CI, 7·40–10·54; p < 0·0001]), respectively. The models were further shown to produce well-calibrated predictions. In a retrospective simulation, AI-assisted diagnostic support reduced the number of referrals to experts by 63%, from 52% of cases (current practice) to 19%, while increasing the diagnostic performance (F1 77·47% vs 83·00%; Δ = 5·54 [95% CI, 4·38–6·69; p < 0·0001]).
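For readers less familiar with the metrics reported above, the following sketch shows the standard definitions of accuracy, sensitivity, specificity, and F1 score for binary labels (1 = malignant, 0 = benign), and checks the arithmetic behind the 63% relative reduction in referrals. The helper names and the toy labels are hypothetical; the study's own evaluation code and its confidence-interval procedure are not described in this abstract and are not reproduced here.

```python
def binary_metrics(y_true, y_pred):
    """Standard binary classification metrics (1 = malignant, 0 = benign)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 when a denominator is empty instead of raising.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def relative_reduction(before, after):
    """Relative decrease from `before` to `after`, as a fraction of `before`."""
    return (before - after) / before


# Toy labels, not study data.
metrics = binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(metrics)

# Referrals fall from 52% of cases to 19% of cases.
referral_drop = relative_reduction(0.52, 0.19)
print(round(referral_drop * 100))  # 63
```

The 63% figure is a relative reduction: the absolute drop is 33 percentage points (52% to 19%), which is about 63% of the original referral rate.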
Interpretation: Our models exhibit strong generalization and outperform both expert and non-expert examiners in diagnostic accuracy. Introducing AI-driven diagnostic support into the clinical workflow may reduce human resource demands, while improving diagnostic performance.
Funding: Funding has been provided by the Swedish Research Council, the Swedish Cancer Society, the Stockholm Regional Council, and the Wallenberg AI, Autonomous Systems and Software Program (WASP).
Declaration of Interest: EE, KS, FC, EK, and PH have applied for a patent that is pending to a company named Intelligyn. EE, KS, and FC hold stock in Intelligyn, where EE also has an unpaid leadership role. NCP’s institution has received payments for activities not related to this article, including lectures, presentations, expert testimonies, and service on speakers’ bureaus, as well as for travel support. NCP has been an advisory board member of Mindray and Philips Ultrasound and has held unpaid leadership roles in the POGS Organization of Government Institutions (POGI) and the Rizal Medical Service Delivery Network, which are Philippine governmental institutions that aim to facilitate the smooth referral of patients. All other authors declare no competing interests.
Ethical Approval: The study was approved by the Swedish Ethics Review Authority (Dnr 2020-06919).
Keywords: ovarian neoplasms, ultrasonography, diagnostic imaging, artificial intelligence, deep learning, machine learning, triage, clinical decision-making