Use machine learning to predict pulmonary metastasis of esophageal cancer: a population-based study

Ying Fang, Jun Wan, Yukai Zeng
DOI: https://doi.org/10.1007/s00432-024-05937-6
2024-09-16
Abstract: Background: This study aims to establish a predictive model for assessing the risk of lung metastasis in esophageal cancer using machine learning techniques. Methods: Data on esophageal cancer patients from 2010 to 2020 were extracted from the Surveillance, Epidemiology, and End Results (SEER) database. Through univariate and multivariate logistic regression analyses, eight indicators related to the risk of lung metastasis were selected. These indicators were incorporated into six machine learning classifiers to develop corresponding predictive models. The performance of these models was evaluated and compared using the area under the curve (AUC), accuracy, sensitivity, specificity, and F1 score. Results: A total of 20,249 confirmed cases of esophageal cancer were included in this study. Of these, 14,174 cases (70%) were assigned to the training set and 6,075 cases (30%) constituted the internal test set. Primary site, tumor histology, tumor grade, T stage, N stage, brain metastasis, bone metastasis, and liver metastasis emerged as independent risk factors for lung metastasis in esophageal cancer. Among the six models constructed, the GBM-based model demonstrated the best performance in internal validation, with an AUC, accuracy, sensitivity, and specificity of 0.803, 0.849, 0.604, and 0.867, respectively. Conclusion: We have developed an online calculator based on the GBM model (https://lvgrkyxcgdvo7ugoyxyywe.streamlit.app/) to aid clinical decision-making and treatment planning.
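As a rough illustration of the evaluation described in the abstract, the sketch below reproduces the 70/30 split sizes from the reported cohort of 20,249 cases and shows how accuracy, sensitivity, specificity, and F1 score follow from a confusion matrix. The function names and the confusion-matrix counts are hypothetical and chosen only for demonstration; they are not taken from the paper, and the rounding rule used for the split is an assumption.

```python
def split_sizes(n_cases, train_percent=70):
    """Return (train, test) sizes for a percentage split, rounding the training set down."""
    # Assumption: the training set is floored; the abstract does not state the rounding rule.
    n_train = n_cases * train_percent // 100
    return n_train, n_cases - n_train


def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, specificity, and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Sensitivity (recall): share of true metastasis cases that were flagged.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of non-metastasis cases correctly left unflagged.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # F1 is the harmonic mean of precision and sensitivity.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Cohort size as reported in the abstract.
train_n, test_n = split_sizes(20249)

# Hypothetical counts for illustration only (not the paper's confusion matrix).
example = classification_metrics(tp=60, fp=130, tn=870, fn=40)
```

With the cohort size from the abstract, `split_sizes(20249)` yields 14,174 training and 6,075 test cases, matching the reported split. The example counts also show why F1 can be low even when accuracy is high: with a rare positive class, a modest number of false positives is enough to pull precision down.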