Development and Validation of Machine Learning–based Model for the Prediction of Malignancy in Multiple Pulmonary Nodules: Analysis from Multicentric Cohorts
Kezhong Chen, Yuntao Nie, Samina Park, Kai Zhang, Yangming Zhang, Yuan Liu, Bengang Hui, Lixin Zhou, Xun Wang, Qingyi Qi, Hao Li, Guannan Kang, Yuqing Huang, Yingtai Chen, Jiabao Liu, Jian Cui, Mingru Li, In Kyu Park, Chang Hyun Kang, Haifeng Shen, Yingshun Yang, Tian Guan, Yaxiao Zhang, Fan Yang, Young Tae Kim, Jun Wang
DOI: https://doi.org/10.1158/1078-0432.ccr-20-4007
IF: 13.801
2021-02-24
Clinical Cancer Research
Abstract: Purpose: Nodule evaluation is both challenging and critical to the diagnosis of multiple pulmonary nodules (MPNs). We aimed to develop and validate a machine learning–based model that estimates the malignant probability of MPNs to guide decision-making. Experimental Design: A boosted ensemble algorithm (XGBoost) was used to predict malignancy from the clinicoradiologic variables of 1,739 nodules from 520 patients with MPNs at a Chinese center. The model (PKU-M model) was trained using 10-fold cross-validation, during which hyperparameters were selected and fine-tuned. The model was validated and compared with solitary pulmonary nodule (SPN) models, clinicians, and a computer-aided diagnosis (CADx) system in an independent transnational cohort and a prospective multicentric cohort. Results: The PKU-M model showed excellent discrimination [area under the curve (AUC), 0.909; 95% confidence interval (95% CI), 0.854–0.946] and calibration (Brier score, 0.122) in the development cohort. External validation (583 nodules) revealed that the AUC of the PKU-M model was 0.890 (0.859–0.916), higher than those of the Brock model [0.806 (0.771–0.838)], PKU model [0.780 (0.743–0.817)], Mayo model [0.739 (0.697–0.776)], and VA model [0.682 (0.640–0.722)]. Prospective comparison (200 nodules) showed that the AUC of the PKU-M model [0.871 (0.815–0.915)] was higher than those of three surgeons [0.790 (0.711–0.852), 0.741 (0.662–0.804), and 0.727 (0.650–0.788)], a radiologist [0.748 (0.671–0.814)], and the CADx system [0.757 (0.682–0.818)]. Furthermore, the model outperformed the clinicians, with an increase of 14.3% in sensitivity and 7.8% in specificity. Conclusions: After development using machine learning algorithms, validation in transnational multicentric cohorts, and prospective comparison with clinicians and the CADx system, this novel prediction model for MPNs showed solid performance and can serve as a convenient reference to support decision-making.
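The abstract reports discrimination as AUC and calibration as a Brier score. As a rough illustration of what these two metrics measure, the following is a minimal sketch in plain Python. It is not the authors' code: the function names `auc` and `brier_score` and the six toy labels and probabilities are hypothetical, chosen only to show the arithmetic.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (0 or 1).

    Lower is better; 0 means perfect probabilistic predictions.
    """
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (malignant, benign) pairs in which the malignant
    case receives the higher predicted probability; ties count as one half.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = malignant, 0 = benign (not from the study).
labels = [1, 0, 1, 1, 0, 0]
probs = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]

print(f"AUC:         {auc(labels, probs):.3f}")
print(f"Brier score: {brier_score(labels, probs):.3f}")
```

In practice these metrics would more likely be computed with an established library, and confidence intervals such as those quoted in the abstract require an additional procedure (for example bootstrapping) that this sketch does not cover.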
oncology