Comparison of deep learning algorithms for prostate cancer detection and proposal of a new contrastive learning pre-training approach
Maria Elena Laino, Camille Ruppli, Luc Beuzit, Roberto Ardon, Martin Charachon, Léo Alberge, Guillaume Herpe, Gaspard d'Assignies
Prostate cancer (PCa) is the second most common cancer in men worldwide [1]. In recent years, deep learning (DL) methods have emerged for PCa detection on magnetic resonance imaging (MRI) [2], but it remains unclear how to standardize and compare their performance, since their metrics are calculated in different ways and on different datasets. For this reason, we screened the DL algorithms for PCa detection available in the literature and standardized their performance by training and testing them on the same dataset with the same metrics, so that the algorithms can be compared directly. Furthermore, we compared them with the cross-validated performance of our newly proposed contrastive learning model.
[1] Bray, Freddie et al. “Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries.” CA: A Cancer Journal for Clinicians 68 (2018): n. pag.
[2] Moore CM, Frangou E, McCartan N, et al., on behalf of the Re-Imagine Study group. Prevalence of MRI lesions in men responding to a GP-led invitation for a prostate health check: a prospective cohort study. BMJ Oncology 2023;2:e000057. doi: 10.1136/bmjonc-2023-000057
1. Literature search
We conducted a literature search on PubMed to identify research studies published between 2010 and January 2023. The inclusion criteria were: studies focusing on deep learning (DL) models developed for MRI prostate lesion detection, studies including at least T2w and/or DWI sequences, English language, and patients with PCa. Articles not focusing on DL methods in PCa, studies on other imaging modalities, editorials, review articles, and animal or phantom studies were excluded. A total of 850 studies were retrieved from the first literature search. After subjective screening of all articles for relevance to the aim of our review, 150 articles were included in our study.
2. Standardization of the model performances
Our dataset is composed of 2566 MRIs from a private dataset and the 1500 MRIs from the PI-CAI [3] challenge public dataset. To train each existing DL method and our newly proposed approach, we used the whole private dataset and two-thirds of the PI-CAI challenge public dataset; the remaining third served as the test set for evaluation. We divided the PI-CAI dataset into 3 parts to allow 3-fold cross-validation (see Figure 1).
3. New proposed contrastive learning model
We took one of the most common architectures for medical image segmentation (U-Net) and proposed an ensemble approach: we averaged the outputs of two networks, a 2D U-Net acting on axial slices and a 3D U-Net. Both networks were pre-trained using a new contrastive learning approach (see Figure 2) and fine-tuned to segment prostate lesions.
4. Comparison
The standardized metrics were the cross-validated AUC at the exam level (presence of at least one significant lesion) and at the lesion level (overlap of detected lesions with ground truth lesions). After comparing the standardized performances of the existing methods, we compared all of them with the performance of our newly proposed method.
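As an illustration of the exam-level metric described in section 4, the following minimal sketch computes an AUC from per-exam labels and scores and averages it over cross-validation folds. The function names `roc_auc` and `mean_cv_auc` and the toy values are hypothetical. How each method reduces its output to a single exam-level score is not specified in this abstract, so that step is deliberately left out.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive exam scores higher
    than a random negative exam (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative exam")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_cv_auc(fold_results):
    """Average the AUC over folds; fold_results is a list of (labels, scores)."""
    return sum(roc_auc(y, s) for y, s in fold_results) / len(fold_results)


# Toy example with made-up values (not data from this study).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # prints 0.75
```

The pairwise formulation is quadratic in the number of exams, which is acceptable for a test fold of a few hundred cases; a rank-based implementation would be preferable for larger sets.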
[3] Saha, Anindo, Twilt, Jasper Jonathan, Bosma, Joeran Sander, van Ginneken, Bram, Yakar, Derya, Elschot, Mattijs, Veltman, Jeroen, Fütterer, Jurgen, de Rooij, Maarten, & Huisman, Henkjan. (2022). The PI-CAI Challenge: Public Training and Development Dataset (v1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6517398
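The ensembling step of the proposed model (section 3) can be sketched as a voxel-wise average of the two network outputs. This is only a sketch under stated assumptions: it assumes both networks output lesion probabilities on the same voxel grid, and the function name `ensemble_lesion_map` and the toy arrays are hypothetical, not the authors' code.

```python
import numpy as np


def ensemble_lesion_map(prob_2d_slices, prob_3d_volume):
    """Average lesion probabilities from a 2D and a 3D network.

    prob_2d_slices: sequence of (H, W) maps, one per axial slice (assumed).
    prob_3d_volume: array of shape (D, H, W) from the 3D network (assumed).
    """
    stacked_2d = np.stack(prob_2d_slices, axis=0)  # rebuild a (D, H, W) volume
    prob_3d_volume = np.asarray(prob_3d_volume, dtype=float)
    if stacked_2d.shape != prob_3d_volume.shape:
        raise ValueError("2D and 3D predictions must share the same grid")
    return 0.5 * (stacked_2d + prob_3d_volume)


# Toy volumes with constant made-up probabilities.
slices_2d = [np.full((4, 4), 0.2) for _ in range(3)]
volume_3d = np.full((3, 4, 4), 0.6)
ensemble = ensemble_lesion_map(slices_2d, volume_3d)
```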
Figure 1: Schematic view of the 3-fold cross-validation approach. Each model was trained on the whole private dataset plus two of the three thirds of the PI-CAI dataset and evaluated on the remaining third; the process was repeated three times, once per held-out third.
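The split described in this caption can be sketched as follows. The case identifiers, the function name `make_folds`, and the random seed are hypothetical; only the dataset sizes (2566 private and 1500 PI-CAI exams) come from the text.

```python
import random


def make_folds(private_ids, picai_ids, n_folds=3, seed=0):
    """Yield (train_ids, test_ids) pairs: the private cases are always used
    for training, and each part of the public set is held out once."""
    shuffled = list(picai_ids)
    random.Random(seed).shuffle(shuffled)
    # Split the public cases into n_folds nearly equal parts.
    parts = [shuffled[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test_ids = parts[k]
        train_public = [c for j, part in enumerate(parts) if j != k for c in part]
        yield list(private_ids) + train_public, test_ids


# Placeholder identifiers standing in for the real exams.
private = [f"private_{i}" for i in range(2566)]
picai = [f"picai_{i}" for i in range(1500)]
folds = list(make_folds(private, picai))
```

With these sizes, each fold trains on 3566 exams and evaluates on 500, and every PI-CAI exam is used for testing exactly once.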
Figure 2: Schematic view of the contrastive learning approach. A batch of images X1, X2 is given as input, and two perturbations are sampled at random to create two different views of the input batch. These two views are fed to two neural networks (f and g). The models are trained to optimize a contrastive loss function whose goal is to attract views of the same image while repelling views of different images.
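The abstract does not state which contrastive loss is used. As a stand-in, the sketch below implements the widely used InfoNCE loss on the embeddings produced by the two networks: matching rows are views of the same image (attracted), and all other rows serve as views of different images (repelled). The function name `info_nce`, the temperature value, and the random embeddings are assumptions made for illustration.

```python
import numpy as np


def info_nce(z_f, z_g, temperature=0.1):
    """InfoNCE loss for two embedding batches of shape (batch, dim).

    Row i of z_f and row i of z_g are two views of the same image;
    every other row of z_g acts as a negative for row i of z_f.
    This is a stand-in loss, not necessarily the one used in the study.
    """
    z_f = z_f / np.linalg.norm(z_f, axis=1, keepdims=True)
    z_g = z_g / np.linalg.norm(z_g, axis=1, keepdims=True)
    logits = z_f @ z_g.T / temperature  # cosine similarities, (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


# Random embeddings in place of real network outputs f(view1), g(view2).
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
aligned = info_nce(z, z)           # each view paired with itself
mismatched = info_nce(z, z[::-1])  # each view paired with a different image
```

The loss is low when the two views of each image are embedded close together and high when the pairing is wrong, which is the attract-and-repel behavior the caption describes.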
First, the identified DL algorithms were trained and evaluated in a standardized way, and the performance of the two best ones (Saha et al. [4] and Retina U-Net [5]) was compared with that of our newly proposed method (table in Figure 3). At the exam level (detection of the presence of at least one lesion), our model outperformed the first method (1) by 0.08 and the second (2) by 0.17 in AUC. At the lesion level, our model outperformed the first method (1) by 0.07 and the second (2) by 0.09 in AUC. Figure 4 shows practical examples in which our method achieves better sensitivity and specificity for lesion detection on DWI sequences.
[4] Saha, Anindo et al. “End-to-end Prostate Cancer Detection in bpMRI via 3D CNNs: Effect of Attention Mechanisms, Clinical Priori and Decoupled False Positive Reduction.” Medical Image Analysis 73 (2021): 102155.
[5] Pellicer-Valero, Oscar J. et al. “Deep learning for fully automatic detection, segmentation, and Gleason grade estimation of prostate cancer in multiparametric magnetic resonance images.” Scientific Reports 12 (2021): n. pag.
Figure 3: AUC performance on the test set.
Figure 4: Cases where our model helped resolve false negatives (FN) and false positives (FP).
After standardized training and evaluation of the existing DL algorithms for PCa detection, we propose a novel contrastive learning pre-training approach that outperforms each of the previous methods.