Radiologic-Radiomic Machine Learning Models for Differentiation of Benign and Malignant Solid Renal Masses: Comparison With Expert-Level Radiologists
Xue-Ying Sun, Qiu-Xia Feng, Xun Xu, Jing Zhang, Fei-Peng Zhu, Yan-Hao Yang, Yu-Dong Zhang
DOI: https://doi.org/10.2214/ajr.19.21617
2020-01-01
American Journal of Roentgenology
Abstract: <b>OBJECTIVE.</b> The objective of our study was to compare the performance of radiologic-radiomic machine learning (ML) models and expert-level radiologists for differentiation of benign and malignant solid renal masses using contrast-enhanced CT examinations. <b>MATERIALS AND METHODS.</b> This retrospective study included a cohort of 254 renal cell carcinomas (RCCs) (190 clear cell RCCs [ccRCCs], 38 chromophobe RCCs [chrRCCs], and 26 papillary RCCs [pRCCs]), 26 fat-poor angioleiomyolipomas, and 10 oncocytomas with preoperative CT examinations. Lesions identified by four expert-level radiologists (> 3000 genitourinary CT and MRI studies) were manually segmented for radiologic-radiomic analysis. Disease-specific support vector machine radiologic-radiomic ML models for classification of renal masses were trained and validated using 10-fold cross-validation. Performance values for the expert-level radiologists and radiologic-radiomic ML models were compared using the McNemar test. <b>RESULTS.</b> The performance values for the four radiologists were as follows: sensitivity of 73.7-96.8% (median, 84.5%; variance, 122.7%) and specificity of 48.4-71.9% (median, 61.8%; variance, 161.6%) for differentiating ccRCCs from pRCCs and chrRCCs; sensitivity of 73.7-96.8% (median, 84.5%; variance, 122.7%) and specificity of 52.8-88.9% (median, 80.6%; variance, 269.1%) for differentiating ccRCCs from fat-poor angioleiomyolipomas and oncocytomas; and sensitivity of 28.1-60.9% (median, 50.0%; variance, 191.1%) and specificity of 75.0-88.9% for differentiating pRCCs and chrRCCs from fat-poor angioleiomyolipomas and oncocytomas.
With 10-fold cross-validation, the radiologic-radiomic ML model yielded sensitivities of 90.0%, 86.3%, and 73.4% and specificities of 89.1%, 83.3%, and 91.7% for differentiating ccRCCs from pRCCs and chrRCCs, ccRCCs from fat-poor angioleiomyolipomas and oncocytomas, and pRCCs and chrRCCs from fat-poor angioleiomyolipomas and oncocytomas, respectively. <b>CONCLUSION.</b> Expert-level radiologists showed large variance in performance for differentiating benign from malignant solid renal masses. Radiologic-radiomic ML may improve interreader concordance and performance.
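The abstract reports sensitivity and specificity for each reader and model and compares them with the McNemar test. The following is a minimal sketch of how such a paired comparison could be computed, not the authors' implementation. The function names, the toy labels, and the choice of the exact (binomial) form of the McNemar test are all assumptions made for illustration; the abstract does not state which variant was used.

```python
from math import comb


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test on paired correct/incorrect calls.

    b counts cases where A is correct and B is wrong; c counts the reverse.
    Under the null hypothesis the discordant pairs follow Binomial(b + c, 0.5).
    """
    b = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a != t and m == t)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    p_value = 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, p_value)


# Hypothetical toy data (not from the study): 6 positive and 4 negative lesions.
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
reader = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]  # hypothetical radiologist calls
model = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]   # hypothetical ML model calls

reader_perf = sensitivity_specificity(y_true, reader)
model_perf = sensitivity_specificity(y_true, model)
discordant = mcnemar_exact(y_true, reader, model)
print(reader_perf, model_perf, discordant)
```

In this toy example the two sets of calls disagree on only two lesions, so the test has almost no power; with a cohort the size of the one described above, the discordant counts would be much larger.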
radiology, nuclear medicine & medical imaging