Convolutional Neural Networks for Automated Classification of Prostate Multiparametric Magnetic Resonance Imaging Based on Image Quality

Stefano Cipollari, Valerio Guarrasi, Martina Pecoraro, Marco Bicchetti, Emanuele Messina, Lorenzo Farina, Paola Paci, Carlo Catalano, Valeria Panebianco
DOI: https://doi.org/10.1002/jmri.27879
Abstract:
Background: Prostate magnetic resonance imaging (MRI) is technically demanding and requires high image quality to reach its full diagnostic potential. An automated method to identify diagnostically inadequate images could help optimize image quality.
Purpose: To develop a convolutional neural network (CNN)-based analysis pipeline for the classification of prostate MRI image quality.
Study type: Retrospective.
Subjects: Three hundred sixteen prostate multiparametric MRI (mpMRI) scans from 312 men (median age 67 years).
Field strength/sequence: 3 T; fast spin-echo T2-weighted imaging (T2WI), echo-planar diffusion-weighted imaging (DWI), apparent diffusion coefficient (ADC) maps, and gradient-echo dynamic contrast-enhanced (DCE) imaging.
Assessment: MRI scans were reviewed by three genitourinary radiologists (V.P., M.D.M., S.C.) with 21, 12, and 5 years of experience, respectively. Sequences were labeled as high quality (Q1) or low quality (Q0), and these labels were used as the reference standard for all analyses.
Statistical tests: Sequences were split into training, validation, and testing sets (869, 250, and 120 sequences, respectively). Inter-reader agreement was assessed with the Fleiss kappa. Following preprocessing and data augmentation, 28 CNNs were trained on MRI slices for each sequence. Model performance was assessed on both a per-slice and a per-sequence basis. A pairwise t-test was performed to compare the performance of the classifiers.
Results: The number of sequences labeled as Q0 vs. Q1 was 38 vs. 278 for T2WI, 43 vs. 273 for DWI, 41 vs. 275 for ADC, and 38 vs. 253 for DCE. Inter-reader agreement was almost perfect for T2WI and DCE and substantial for DWI and ADC. On the per-slice analysis, accuracy was 89.95% ± 0.02% for T2WI, 79.83% ± 0.04% for DWI, 76.64% ± 0.04% for ADC, and 96.62% ± 0.01% for DCE. On the per-sequence analysis, accuracy was 100% ± 0.00% for T2WI, DWI, and DCE, and 92.31% ± 0.00% for ADC. The three best algorithms performed significantly better than the remaining ones on every sequence (P < 0.05).
Data conclusion: CNNs achieved high accuracy in classifying prostate MRI image quality on an individual-slice basis and almost perfect accuracy when classifying entire sequences.
Evidence level: 4.
Technical efficacy: Stage 1.
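
Two steps mentioned in the abstract can be illustrated with a minimal Python sketch: the Fleiss kappa used for inter-reader agreement, and the move from per-slice predictions to a per-sequence label. This is not the authors' code. The function names, the toy ratings, the 0.5 threshold, and the majority-vote aggregation rule are all assumptions made for illustration; the abstract does not state how slice predictions were combined. The verbal bands ("substantial", "almost perfect") follow the commonly used Landis and Koch scale.

```python
import math
from collections import Counter


def fleiss_kappa(ratings, categories=("Q0", "Q1")):
    """Fleiss' kappa for a list of items, each rated by the same number of readers.

    ratings: list of lists, e.g. [["Q1", "Q1", "Q0"], ...] (one inner list per sequence).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][j] = number of readers who assigned item i to category j
    counts = [[Counter(item)[c] for c in categories] for item in ratings]
    # Observed agreement per item, averaged over items
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_items) / n_items
    # Agreement expected by chance, from the overall category proportions
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        return math.nan  # kappa is undefined when only one category is ever used
    return (p_observed - p_expected) / (1.0 - p_expected)


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch verbal agreement band."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


def sequence_label(slice_probs, threshold=0.5):
    """Hypothetical aggregation: majority vote over per-slice P(Q1) scores.

    Ties are resolved in favor of Q1; this is an arbitrary choice for the sketch.
    """
    votes_q1 = sum(p >= threshold for p in slice_probs)
    return "Q1" if 2 * votes_q1 >= len(slice_probs) else "Q0"


# Toy example: three readers labeling four sequences (made-up ratings)
toy = [["Q1", "Q1", "Q1"], ["Q0", "Q0", "Q0"], ["Q1", "Q1", "Q0"], ["Q1", "Q1", "Q1"]]
kappa = fleiss_kappa(toy)
print(round(kappa, 3), landis_koch(kappa))
print(sequence_label([0.9, 0.8, 0.3]))
```

Under a voting rule like this one, a few misclassified slices do not change the sequence-level label, which is one plausible reason per-sequence accuracy can exceed per-slice accuracy.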