Machine learning for predicting liver and/or lung metastasis in colorectal cancer: a retrospective study based on the SEER database

Zhentian Guo, Zongming Zhang, Limin Liu, Yue Zhao, Zhuo Liu, Chong Zhang, Hui Qi, Jinqiu Feng, Chunmin Yang, Weiping Tai, Filippo Banchini, Riccardo Inchingolo
DOI: https://doi.org/10.1016/j.ejso.2024.108362
IF: 4.037
2024-04-29
European Journal of Surgical Oncology
Abstract: Objective: This study aims to establish a machine learning (ML) model for predicting the risk of liver and/or lung metastasis in colorectal cancer (CRC). Methods: Using the National Institutes of Health (NIH) Surveillance, Epidemiology, and End Results (SEER) database, a total of 51,265 patients with a pathological diagnosis of colorectal cancer from 2010 to 2015 were extracted for model development. On this basis, we established seven machine learning models. The models were evaluated by accuracy and by the area under the receiver operating characteristic (ROC) curve (AUC), and the relationship between clinicopathological features and the target variable was explained using the best model. We validated the model in 196 colorectal cancer patients at Beijing Electric Power Hospital of Capital Medical University, China, to evaluate its performance and generalizability. Finally, we developed a web-based calculator using the best model to predict the risk of liver and/or lung metastasis in colorectal cancer patients. Results: A total of 51,265 patients were enrolled in the study, of whom 7,864 (15.3%) had distant liver and/or lung metastasis. The random forest (RF) model had the best predictive ability: in the internal test set, it achieved an accuracy of 0.895, an AUC of 0.956, and an area under the precision-recall curve (AUPR) of 0.896. In the external validation set, the RF model achieved an accuracy of 0.913, an AUC of 0.912, and an AUPR of 0.611. Conclusion: In this study, we constructed an RF model to predict the risk of liver and/or lung metastasis in colorectal cancer, to assist doctors in making clinical decisions.
oncology,surgery
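The abstract reports three metrics (accuracy, AUC, and AUPR) without saying how they were computed. The following is a minimal dependency-free sketch of one common way to calculate each from binary labels and predicted risk scores. The function names, the 0.5 decision threshold, the use of average precision as the AUPR estimate, and the toy data are all illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of cases classified correctly at an assumed decision threshold."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outranks a random negative.

    Tied scores count as half a win (Mann-Whitney formulation).
    """
    pos = [s for s, t in zip(y_score, y_true) if t == 1]
    neg = [s for s, t in zip(y_score, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.

    Averages the precision observed at the rank of each positive case.
    Assumes no tied scores, for simplicity.
    """
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy data, invented for illustration: 1 = metastasis, 0 = no metastasis.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1]

print("accuracy:", accuracy(y_true, y_score))
print("AUC:", roc_auc(y_true, y_score))
print("AUPR (average precision):", average_precision(y_true, y_score))
```

Unlike AUC, a precision-based metric depends on the proportion of positive cases in the evaluated cohort, so the three numbers are best read together rather than in isolation.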