Performance of Artificial Intelligence-Aided Diagnosis System for Clinically Significant Prostate Cancer with MRI: A Diagnostic Comparison Study

Ke-Wen Jiang, Yang Song, Ying Hou, Rui Zhi, Jing Zhang, Mei-Ling Bao, Hai Li, Xu Yan, Wei Xi, Cheng-Xiu Zhang, Ye-Feng Yao, Guang Yang, Yu-Dong Zhang
DOI: https://doi.org/10.1002/jmri.28427
2022-01-01
Abstract:
Background: Accurate interpretation of prostate MRI requires a high level of expertise.
Purpose: To develop and test an artificial intelligence (AI) system for the diagnosis of clinically significant prostate cancer (CsPC) with MRI.
Study Type: Retrospective.
Subjects: A total of 1230 patients from a derivation cohort collected between January 2012 and October 2019, and 169 patients from a publicly available dataset (U-Net: 423 for training/validation and 49 for test; TrumpetNet: 820 for training/validation and 579 for test).
Field Strength/Sequence: 3.0 T scanners; T2-weighted imaging (T2WI), diffusion-weighted imaging, and apparent diffusion coefficient map.
Assessment: A closed-loop AI system was trained with a U-Net for prostate segmentation and a TrumpetNet for CsPC detection. AI performance was tested on 410 internal and 169 external test cases against 24 radiologists categorized into junior, general, and subspecialist groups. A Gleason score >6 at pathology was defined as CsPC.
Statistical Tests: Area under the receiver operating characteristic curve (AUC-ROC); DeLong test; meta-regression I² analysis.
Results: On average, in the internal test, AI had a lower AUC-ROC than subspecialists (0.85 vs. 0.92, P < 0.05) and was comparable to the junior (0.84, P = 0.76) and general (0.86, P = 0.35) groups. In the external test, both AI (0.86) and subspecialists (0.86) had a higher AUC than junior (0.80, P < 0.05) and general readers (0.83, P < 0.05). At the individual level, diagnostic heterogeneity among the 24 readers was moderate (Mantel-Haenszel I² = 56.8%, P < 0.01), and AI outperformed 54.2% (13/24) of readers in summary ROC analysis. In multivariate analysis, Gleason score, zonal location, PI-RADS score, and lesion size significantly affected the accuracy of AI, whereas the effects of data source, MR device, and parameter settings on AI performance were not significant (P > 0.05).
Data Conclusion: Our AI system can match, and in some cases exceed, clinicians in the diagnosis of CsPC with prostate MRI.
Evidence Level: 3
Technical Efficacy: Stage 2
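
The abstract reports AUC-ROC values and an I² heterogeneity statistic without defining them. The following is a minimal illustrative sketch, not the authors' code. It assumes the standard rank-based definition of AUC (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half) and Higgins' definition of I² from Cochran's Q. The function names `auc_roc` and `i_squared` and the toy data are hypothetical and do not come from the study.

```python
from typing import Sequence


def auc_roc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def i_squared(q: float, df: int) -> float:
    """Higgins' I^2 in percent, given Cochran's Q and its degrees of freedom.

    I^2 = max(0, (Q - df) / Q) * 100, where df is the number of readers
    (or studies) minus one.
    """
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical toy data (NOT from the study): 1 = CsPC, 0 = no CsPC.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auc = auc_roc(toy_scores, toy_labels)
```

With 24 readers, `df` would be 23. The abstract does not report Q, so any value passed to `i_squared` here is illustrative only.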