Comparison of nomogram and machine‐learning methods for predicting the survival of non‐small cell lung cancer patients

Haike Lei, Xiaosheng Li, Wuren Ma, Na Hong, Chun Liu, Wei Zhou, Hong Zhou, Mengchun Gong, Ying Wang, Guixue Wang, Yongzhong Wu
DOI: https://doi.org/10.1002/cai2.24
2022-09-02
Cancer Innovation
Abstract: Thirteen clinical variables related to survival status were selected for modeling and analysis. During our observation period, nomograms provided a more reliable prognostic assessment of NSCLC patients than machine‐learning models. In practical clinical applications, an integrated model combining these two approaches may demonstrate superior capabilities.
Background Most patients with advanced non‐small cell lung cancer (NSCLC) have a poor prognosis. Predicting overall survival from clinical data would benefit cancer patients by allowing providers to design an optimum treatment plan. We compared the performance of nomograms with that of machine‐learning models in predicting the overall survival of NSCLC patients. This comparison informs the development and selection of models during the clinical decision‐making process for NSCLC patients.
Methods Multiple machine‐learning models were used in a retrospective cohort of 6586 patients. First, we modeled and validated a nomogram to predict the overall survival of NSCLC patients. Subsequently, five machine‐learning models (logistic regression, random forest, XGBoost, decision tree, and light gradient boosting machine) were used to predict survival status. Next, we evaluated the performance of the models. Finally, the machine‐learning model with the highest accuracy was compared with the nomogram in predicting survival status using a novel performance measure: time‐dependent prediction accuracy.
Results Among the five machine‐learning models, the random forest model achieved the highest accuracy. When time‐dependent prediction accuracy was compared over follow‐up times ranging from 12 to 60 months, the prediction accuracies of both the nomogram and the random forest model varied with time. The nomogram reached a maximum prediction accuracy of 0.85 in the 60th month, and the random forest algorithm reached a maximum prediction accuracy of 0.74 in the 13th month.
Conclusions Overall, the nomogram provided more reliable prognostic assessments of NSCLC patients than machine‐learning models over our observation period. Although machine‐learning methods have been widely adopted for predicting clinical prognoses in recent studies, the conventional nomogram was competitive. In real clinical applications, a comprehensive model that combines these two methods may demonstrate superior capabilities.
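The abstract names time‐dependent prediction accuracy as its comparison measure but does not define it. One plausible reading is sketched below in Python. It is an illustration under stated assumptions, not the authors' implementation: at a given follow‐up horizon, a patient is counted as dead if death was observed by that month, counted as alive if follow‐up extends past it, and excluded if censored earlier. The function name, the 0.5 threshold, and the toy data are all hypothetical.

```python
from typing import Sequence


def time_dependent_accuracy(
    times: Sequence[float],
    events: Sequence[int],
    predicted_risk: Sequence[float],
    horizon: float,
    threshold: float = 0.5,
) -> float:
    """Accuracy of predicted survival status at one follow-up horizon.

    times          -- follow-up time per patient, in months
    events         -- 1 if death was observed, 0 if censored
    predicted_risk -- model's predicted probability of death by `horizon`
    threshold      -- assumed cutoff for calling a patient "dead by horizon"
    """
    correct = 0
    evaluable = 0
    for t, e, risk in zip(times, events, predicted_risk):
        if e == 1 and t <= horizon:
            observed_dead = True       # death observed by the horizon
        elif t > horizon:
            observed_dead = False      # still under follow-up past the horizon
        else:
            continue                   # censored before the horizon: status unknown
        evaluable += 1
        if (risk >= threshold) == observed_dead:
            correct += 1
    if evaluable == 0:
        raise ValueError("no patient has a known status at this horizon")
    return correct / evaluable


# Toy data (fabricated for illustration only, not from the study).
times = [10, 14, 30, 50, 62, 20]
events = [1, 1, 0, 1, 0, 0]
risk_at_24 = [0.9, 0.4, 0.3, 0.2, 0.1, 0.7]
acc_24 = time_dependent_accuracy(times, events, risk_at_24, horizon=24)
```

Evaluating this function at each month from 12 to 60, with the predicted risks recomputed for each horizon, would give an accuracy curve per model of the kind the Results summarize. Excluding patients censored before the horizon is one simple way to handle censoring; weighting schemes are another, and the abstract does not say which the authors used.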