Machine learning predicts lymph node metastasis of poorly differentiated-type intramucosal gastric cancer

Cheng-Mao Zhou, Ying Wang, Hao-Tian Ye, Shuping Yan, Muhuo Ji, Panmiao Liu, Jian-Jun Yang
DOI: https://doi.org/10.1038/s41598-020-80582-w
IF: 4.6
2021-01-14
Scientific Reports
Abstract: This study aimed to construct a machine learning model of lymph node metastasis (LNM) in patients with poorly differentiated-type intramucosal gastric cancer (PDC gastric cancer). A total of 1169 postoperative gastric cancer patients were divided into a training group and a test group at a ratio of 7:3, and the LNM model was built with machine learning in Python. In the GBDT algorithm, the number of resected nodes, lymphovascular invasion and tumor size were the three factors carrying the most weight for LNM. Results of the LNM model of PDC gastric cancer patients in the training group: among the 7 algorithm models, GBDT had the highest accuracy (0.955); the AUC values, from high to low, were XGB (0.881), RF (0.802), GBDT (0.798), LR (0.778), XGB + LR (0.739), RF + LR (0.691) and GBDT + LR (0.626). Results of the LNM model of PDC gastric cancer patients in the test group: among the 7 algorithm models, XGB had the highest accuracy (0.952); the AUC values, from high to low, were GBDT (0.788), RF (0.765), XGB (0.762), LR (0.750), RF + LR (0.678), GBDT + LR (0.650) and XGB + LR (0.619). A single machine learning algorithm can predict LNM in poorly differentiated-type intramucosal gastric cancer, but the fusion algorithms did not improve prediction performance.
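The 7:3 division of the 1169 patients can be illustrated with a minimal sketch. The abstract does not say how the split was performed or give the exact group sizes, so the shuffling approach, the seed, and the helper name `split_train_test` below are assumptions made only for illustration.

```python
import random


def split_train_test(ids, train_fraction=0.7, seed=0):
    """Shuffle ids reproducibly and split them into train and test lists.

    Hypothetical helper: the paper's actual splitting procedure is not
    described in the abstract.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    # Round to the nearest whole patient for the training share.
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder patient identifiers 0..1168 stand in for the 1169 cases.
patient_ids = range(1169)
train_ids, test_ids = split_train_test(patient_ids)
print(len(train_ids), len(test_ids))
```

Under these assumptions a 7:3 split of 1169 cases gives 818 training and 351 test cases; the groups in the paper may differ slightly depending on how rounding or stratification was handled.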
multidisciplinary sciences
What problem does this paper attempt to address?
The main purpose of this paper is to construct a machine learning model that predicts lymph node metastasis (LNM) in patients with poorly differentiated-type intramucosal gastric cancer (PDC). The researchers included 1,169 postoperative gastric cancer patients and divided them into training and test groups at a 7:3 ratio. A prediction model for lymph node metastasis was then built with machine learning in Python. According to the GBDT model, the number of resected lymph nodes, lymphovascular invasion, and tumor size were the three factors carrying the most weight for lymph node metastasis. The study also compared four single algorithms, Gradient Boosting Decision Tree (GBDT), Random Forest (RF), Extreme Gradient Boosting (XGB), and Logistic Regression (LR), with three fusion algorithms that combine a tree model with LR (GBDT + LR, RF + LR, and XGB + LR). It concluded that a single machine learning algorithm can predict lymph node metastasis in these patients, whereas the fusion algorithms did not improve prediction performance: in the test group, every fusion model had a lower AUC (0.619 to 0.678) than every single model (0.750 to 0.788). In conclusion, the study offers a method for predicting lymph node metastasis in early gastric cancer patients, which may help guide the selection of personalized treatment plans.
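The abstract ranks the algorithms by two different metrics, accuracy and AUC, and the two rankings do not coincide. The sketch below shows how each metric can be computed from scratch; the labels and scores are invented toy values, not data from the study, and the 0.5 decision threshold is an assumption.

```python
def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical): 1 = LNM present, 0 = LNM absent.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.45, 0.70]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred))
print(roc_auc(y_true, scores))
```

Accuracy depends on the chosen threshold, while AUC depends only on how the scores rank positive cases against negative ones, which is one reason a model can lead on one metric and not on the other.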