MRI-Based Machine Learning for Differentiating Borderline from Malignant Epithelial Ovarian Tumors: A Multicenter Study.
Yong'ai Li, Junming Jian, Perry J. Pickhardt, Fenghua Ma, Wei Xia, Haiming Li, Rui Zhang, Shuhui Zhao, Songqi Cai, Xingyu Zhao, Jiayi Zhang, Guofu Zhang, Jingxuan Jiang, Yan Zhang, Keying Wang, Guangwu Lin, Feng, Jing Lu, Lin Deng, Xiaodong Wu, Jinwei Qiang, Xin Gao
DOI: https://doi.org/10.1002/jmri.27084
IF: 4.4
2020-01-01
Journal of Magnetic Resonance Imaging
Abstract: BACKGROUND Preoperative differentiation of borderline from malignant epithelial ovarian tumors (BEOT from MEOT) can impact surgical management. MRI has improved this assessment, but subjective interpretation by radiologists may lead to inconsistent results. PURPOSE To develop and validate an objective MRI-based machine-learning (ML) assessment model for differentiating BEOT from MEOT, and to compare its performance against radiologists' interpretation. STUDY TYPE Retrospective study of eight clinical centers. POPULATION In all, 501 women with histopathologically confirmed BEOT (n = 165) or MEOT (n = 336) from 2010 to 2018 were enrolled. Three cohorts were constructed: a training cohort (n = 250), an internal validation cohort (n = 92), and an external validation cohort (n = 159). FIELD STRENGTH/SEQUENCE Preoperative MRI within 2 weeks of surgery. Single- and multiparameter (MP) machine-learning assessment models were built utilizing the following four MRI sequences: T2-weighted imaging (T2WI), fat saturation (FS), diffusion-weighted imaging (DWI), apparent diffusion coefficient (ADC), and contrast-enhanced (CE)-T1WI. ASSESSMENT Diagnostic performance of the models was assessed for both whole tumor (WT) and solid tumor (ST) components. The performance of the model in discriminating BEOT vs. early-stage MEOT was also assessed. Six radiologists of varying experience also interpreted the MR images. STATISTICAL TESTS Mann-Whitney U-test: significance of the clinical characteristics; chi-square test: difference of label; DeLong test: difference of receiver operating characteristic (ROC). RESULTS The MP-ST model performed better than the MP-WT model for both the internal validation cohort (area under the curve [AUC] = 0.932 vs. 0.917) and external validation cohort (AUC = 0.902 vs. 0.767). The model showed capability in discriminating BEOT vs. early-stage MEOT, with AUCs of 0.909 and 0.920, respectively. Radiologist performance was considerably poorer in both the internal (mean AUC = 0.792; range, 0.679-0.924) and external (mean AUC = 0.797; range, 0.744-0.867) validation cohorts. DATA CONCLUSION Performance of the MRI-based ML model was robust and superior to subjective assessment of radiologists. If our approach can be implemented in clinical practice, improved preoperative prediction could potentially lead to preserved ovarian function and fertility for some women. LEVEL OF EVIDENCE Level 4. TECHNICAL EFFICACY Stage 2.
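Note on the AUC metric: the abstract reports performance as the area under the ROC curve. As a minimal illustration of what that number measures, the sketch below computes AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case (the Mann-Whitney formulation). It is not the authors' pipeline. The function name `auc_from_scores`, the label convention (1 = MEOT, 0 = BEOT), and the example scores are all hypothetical and chosen only for illustration.

```python
def auc_from_scores(scores, labels):
    """Return the ROC AUC via pairwise comparison (Mann-Whitney formulation).

    scores: model outputs, higher meaning "more likely positive".
    labels: 1 for positive (assumed here: MEOT), 0 for negative (assumed: BEOT).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six cases; not data from the study.
example_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
example_labels = [1, 1, 1, 0, 0, 0]
example_auc = auc_from_scores(example_scores, example_labels)
print(f"AUC = {example_auc:.3f}")
```

In this toy example, eight of the nine positive-negative pairs are ranked correctly, so the AUC is 8/9. Comparing two AUCs measured on the same cases, as the DeLong test in the abstract does, additionally requires an estimate of their covariance, which this sketch does not attempt.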