Self-supervised learning for volumetric imaging: a prostate cancer biparametric MRI case study
32 Pages. Posted: 18 Jun 2024. Last revised: 4 Dec 2024
Date Written: November 30, 2024
Abstract
Background and Objective
To develop two-dimensional (2D) self-supervised learning (SSL) models that can be transferred to volumetric (three-dimensional, 3D) models for use in volumetric imaging, and to demonstrate their application in volumetric prostate biparametric MRI (bpMRI) classification tasks.
Methods
Prostate multiparametric MRI (mpMRI) data from 12 distinct European centres were used to train two SSL methods. We transferred these models to classification tasks in volumetric prostate bpMRI using three attention-based multiple instance learning (MIL) methods with T2-weighted (T2) or bpMRI studies. Three prostate cancer (PCa) tasks were considered: PCa diagnosis (D-PCa), clinically significant PCa (csPCa) diagnosis (D-csPCa), and virtual biopsy to confirm csPCa (VB). All approaches were compared with a fully supervised learning (FSL) baseline. Performance was assessed using the area under the receiver operating characteristic curve (AUC), with both 5-fold cross-validation and a hold-out test set. Finally, sensitivity analyses were performed for training and pre-training dataset size, data domain (MRI vs. natural images), and architecture.
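To illustrate how per-slice features from a 2D model can be aggregated into a single volume-level representation, the following is a minimal sketch of attention-based MIL pooling. It is not the authors' implementation: the abstract does not specify which three MIL variants were used, so this sketch assumes the common tanh-attention formulation, and the function name `attention_mil_pool`, the parameters `V` and `w`, and the random toy inputs are all hypothetical.

```python
import math
import random


def attention_mil_pool(slice_features, V, w):
    """Pool per-slice feature vectors into one volume-level vector.

    slice_features: list of n_slices vectors, each of length d
                    (assumed to come from a 2D encoder; hypothetical here)
    V: d x h projection matrix (list of d rows of length h)
    w: attention vector of length h
    Returns (pooled_vector, attention_weights).
    """
    scores = []
    for feat in slice_features:
        # Hidden representation: tanh(feat @ V), one value per hidden unit
        hidden = [
            math.tanh(sum(f * V[i][j] for i, f in enumerate(feat)))
            for j in range(len(w))
        ]
        # Unnormalised attention score for this slice
        scores.append(sum(hj * wj for hj, wj in zip(hidden, w)))

    # Softmax over slices (shifted by the maximum for numerical stability)
    top = max(scores)
    exp_scores = [math.exp(s - top) for s in scores]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]

    # Attention-weighted sum of the slice features
    dim = len(slice_features[0])
    pooled = [
        sum(a * feat[i] for a, feat in zip(weights, slice_features))
        for i in range(dim)
    ]
    return pooled, weights


# Toy usage with random numbers standing in for real slice features
random.seed(0)
n_slices, d, h = 24, 16, 8
features = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n_slices)]
V = [[random.gauss(0, 1) for _ in range(h)] for _ in range(d)]
w = [random.gauss(0, 1) for _ in range(h)]
pooled, weights = attention_mil_pool(features, V, w)
```

In a real model `V` and `w` would be learned jointly with a classifier applied to the pooled vector; the attention weights indicate how much each slice contributes to the study-level prediction.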
Results
Two 2D SSL methods were trained using 6,798 studies (1,722,978 DICOM images) and their downstream performance was assessed on 3D tasks (n=1,622, n=1,615 and n=1,295 bpMRI studies for D-PCa, D-csPCa and VB, respectively). We show these models are comparable to or better than FSL baseline models trained on the same data: AUC(SSL)=0.82 and AUC(FSL)=0.75 for bpMRI D-PCa (p=0.017), AUC(SSL)=0.73 and AUC(FSL)=0.68 for T2 D-csPCa (p=0.043) and AUC(SSL)=0.73 and AUC(FSL)=0.65 for bpMRI VB, while other models showed no differences (p>0.05). Learning curve analyses show that SSL-based models required less training data to achieve similar performance, while sensitivity analyses showed that large amounts of domain-specific pre-training data are essential for optimal performance.
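For readers less familiar with the AUC values reported above, the sketch below computes AUC from its rank-based definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The abstract does not state which software or statistical test was used, so the function name `roc_auc` and the toy labels and scores are illustrative assumptions only.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: list of 0/1 ground-truth labels
    scores: list of predicted scores, same length as labels
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count a win when the positive outscores the negative, half for a tie
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (not data from the study): 3 of 4 pairs are ranked correctly
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)  # 0.75
```

The pairwise form is quadratic in the number of cases, which is acceptable for test sets of the size reported here; library implementations typically use a sorted, rank-based equivalent.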
Conclusion
Unannotated data were used to train SSL models that were more data-efficient and performed better than FSL models, highlighting the importance of large-scale data collection efforts in biomedical imaging.
Keywords: self-supervised learning, multiple-instance learning, prostate multi-parametric MRI
Suggested Citation:
Almeida, José and Castro Verde, Ana Sofia and Gaivão, Ana and Bilreiro, Carlos and Santiago, Inês and Ip, Joana and Belião, Sara and Matos, Celso and Tsiknakis, Manolis and Marias, Kostas and Regge, Daniele and Papanikolaou, Nikolaos, Self-supervised learning for volumetric imaging: a prostate cancer biparametric MRI case study (November 30, 2024). Available at SSRN: https://ssrn.com/abstract=4864797 or http://dx.doi.org/10.2139/ssrn.4864797