Machine Learning-Based Model for the Prognosis of Postoperative Gastric Cancer
Donghui Liu,Xuyao Wang,Long Li,Qingxin Jiang,Xiaoxue Li,Menglin Liu,Wenxin Wang,Enhong Shi,Chenyao Zhang,Yinghui Wang,Yan Zhang,Liru Wang
DOI: https://doi.org/10.2147/CMAR.S342352
2022-01-07
Cancer Management and Research
Author affiliations: Donghui Liu (1, 2), Xuyao Wang (3), Long Li (4), Qingxin Jiang (5), Xiaoxue Li (2), Menglin Liu (2), Wenxin Wang (2), Enhong Shi (2), Chenyao Zhang (2), Yinghui Wang (2), Yan Zhang (1, *), Liru Wang (1, 2, *)
1 School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang Province, People's Republic of China; 2 Department of Oncology, Heilongjiang Provincial Hospital, Harbin, Heilongjiang Province, People's Republic of China; 3 Department of Pharmacy, Harbin Second Hospital, Harbin, Heilongjiang Province, People's Republic of China; 4 Department of General Surgery, First Affiliated Hospital of Harbin Medical University, Harbin, Heilongjiang Province, People's Republic of China; 5 Department of General Surgery, Harbin 242 Hospital of Genertec Medical, Harbin, Heilongjiang Province, People's Republic of China
*These authors contributed equally to this work
Correspondence: Yan Zhang, School of Life Science and Technology, Harbin Institute of Technology, No. 92 Xidazhi Street, Nangang District, Harbin, Heilongjiang, People's Republic of China, Tel +86 13936253249; Liru Wang, Department of Oncology, Heilongjiang Provincial Hospital, No. 82 Zhongshan Road, Xiangfang District, Harbin, Heilongjiang, People's Republic of China, Tel +86 13633609001
Abstract: Background: The use of machine learning (ML) to predict disease prognosis has increased, and researchers have adopted various cancer classification methods to optimize early screening and to determine prognosis in advance. In this study, we aimed to improve the accuracy of prognosis prediction for postoperative gastric cancer patients by constructing a highly effective prognostic model.
Methods: The study used postoperative gastric cancer patient data from the Surveillance, Epidemiology, and End Results (SEER) database.
The LASSO regression method was used to construct a clinical prognostic model, and four machine learning methods (Boruta algorithm, neural network, support vector machine, and random forest) were used to screen and recombine the features to construct an ML prognostic model. Clinical information on 955 postoperative gastric cancer patients collected from the Affiliated Tumor Hospital of Harbin Medical University was used for external validation.
Results: For both the clinical prognostic model built directly with LASSO regression and the ML prognostic model, the 1-, 3- and 5-year AUC values in the training, validation and external validation sets were all around 0.8.
Conclusion: Both models can accurately evaluate the prognosis of postoperative patients with gastric cancer, which may support accurate and personalized treatment of these patients.
Keywords: machine learning, gastric cancer, prognosis, Boruta, ElasticNet, SVM, random forest

According to the global cancer statistics released by the World Health Organization in 2018, gastric cancer (GC) ranked fifth in incidence and third in mortality. It is common in East Asia, with an incidence rate of 32.1 per 100,000 people and a mortality rate of 13.2 per 100,000 people [1]; the prevention and treatment of gastric cancer therefore deserve close attention. Treatment decisions depend largely on the predicted prognosis, so accurately predicting the prognosis of individual patients is of great importance when choosing an appropriate treatment strategy for gastric cancer. Surgery, the main treatment for gastric cancer, is considered the only potentially curative option. Although surgical techniques have improved steadily in recent years, the overall prognosis remains poor [2,3].
Many factors influence prognosis, such as gender, age, Eastern Cooperative Oncology Group (ECOG) score, tumor location, tumor size, degree of differentiation, tumor grade, tissue type, TNM staging, and chemotherapy [4–6], among which TNM staging is widely used in clinical work. Unfortunately, TNM staging alone cannot accurately predict the overall postoperative prognosis of patients [7]; it is therefore very important to establish a reliable model to predict the prognosis of high-risk patients and to formulate individualized treatment strategies. In recent years, researchers have adopted different cancer classification methods to optimize early screening, determine prognosis in advance and, at the same time, develop new targeted treatment strategies. Therefore, machine learning (ML) methods have become an -Abstract Truncated-
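The Results above report AUC values at 1, 3 and 5 years. As a rough illustration of what such a horizon-specific AUC measures, the sketch below treats patients who died on or before a given time point as cases and patients known to be alive after it as controls, then computes the fraction of case/control pairs in which the case has the higher predicted risk. This is not the authors' implementation: the function name `auc_at_horizon`, the toy follow-up data and the risk scores are all hypothetical, and dropping patients censored before the horizon is a simplification that ignores censoring weights.

```python
from typing import Optional, Sequence


def auc_at_horizon(
    times: Sequence[float],
    died: Sequence[bool],
    risk: Sequence[float],
    horizon: float,
) -> Optional[float]:
    """AUC for 'died on or before horizon' versus 'known alive after horizon'.

    Patients censored before the horizon have unknown status at that time
    and are dropped (a simplification, not a censoring-weighted estimator).
    """
    cases = [r for t, d, r in zip(times, died, risk) if d and t <= horizon]
    controls = [r for t, d, r in zip(times, died, risk) if t > horizon]
    if not cases or not controls:
        return None  # the AUC is undefined unless both groups are present

    # Mann-Whitney form of the AUC: share of case/control pairs in which
    # the case has the higher risk score; ties count as one half.
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy cohort: follow-up in months, death indicator, model risk score.
times = [6, 10, 20, 30, 40, 50, 62, 70]
died = [True, True, False, True, True, False, False, True]
risk = [0.9, 0.7, 0.5, 0.8, 0.3, 0.4, 0.2, 0.1]

for months in (12, 36, 60):
    print(months, auc_at_horizon(times, died, risk, months))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so values around 0.8 mean that in roughly four out of five case/control pairs the model assigns the higher risk to the patient who died by that time point.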
oncology