Abstract: Background: Distant metastasis can seriously affect the treatment strategy for patients with gastric cancer, so it is essential to identify patients at high risk of distant metastasis as early as possible. Method: In this study, we retrospectively collected data on 18,472 gastric cancer patients from the SEER database. We applied six machine learning algorithms to construct models for predicting distant metastasis of gastric cancer, and trained them using 10-fold cross-validation. We evaluated the models using the area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (AUPRC), decision curve analysis, and calibration curves. In addition, we used SHapley Additive exPlanations (SHAP) to interpret the machine learning models and plotted correlation heat maps of the predictor variables. We used data from 1,595 gastric cancer patients at the First Hospital of Jilin University for external validation. We then selected the optimal model and built a web-based online calculator for predicting the risk of distant metastasis of gastric cancer. Result: The study included 18,472 patients with gastric cancer from the SEER database, of whom 4,202 (22.75%) had distant metastases. Multivariate logistic regression analysis showed that age, race, grade of differentiation, tumor size, T stage, radiotherapy, and chemotherapy were independent risk factors for distant metastasis of gastric cancer. In the 10-fold cross-validation of the training set, the average AUC of the random forest (RF) model was 0.80. The RF model performed best in both the internal test set and the external validation set. In the internal test set, the RF model had an AUC of 0.80, an AUPRC of 0.555, an accuracy of 0.81, and a precision of 0.78. In the external validation set, it had an AUC of 0.76, an AUPRC of 0.496, an accuracy of 0.82, and a precision of 0.81. Finally, we built the web-based calculator for distant metastasis of gastric cancer using the RF model. Conclusion: Using pathological and clinical indicators, we constructed a well-performing RF model for predicting the risk of distant metastasis in gastric cancer patients to help clinicians make clinical decisions.
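The two headline metrics in the abstract, AUC and AUPRC, can be illustrated with a minimal sketch. This is not the authors' code. The function names `roc_auc` and `average_precision` and the toy labels and scores are assumptions made for illustration, and AUPRC is approximated here by average precision, which may differ slightly from the estimator used in the study. One point that helps when reading the reported values: for a classifier with no skill, AUPRC is roughly the fraction of positive cases, which in the SEER cohort is about 0.2275 (4,202 of 18,472).

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive outscores a random negative.

    A tie between a positive and a negative counts as half a win.
    The pairwise loop is O(n^2), which is fine for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: sum of (recall gain) * precision over thresholds."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = 0
    ap = 0.0
    prev_recall = 0.0
    i = 0
    while i < len(order):
        j = i
        # Samples tied at the same score share a single threshold.
        while j < len(order) and scores[order[j]] == scores[order[i]]:
            tp += labels[order[j]]
            j += 1
        recall = tp / n_pos
        precision = tp / j  # j samples are predicted positive at this threshold
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


# Hypothetical toy data: 1 = distant metastasis, 0 = none.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.3, 0.1]

print(round(roc_auc(toy_labels, toy_scores), 3))
print(round(average_precision(toy_labels, toy_scores), 3))

# No-skill AUPRC reference for the SEER cohort described in the abstract.
seer_prevalence = 4202 / 18472
print(round(seer_prevalence, 4))
```

Read against that reference, the reported AUPRC values of 0.555 (internal test set) and 0.496 (external validation set) sit well above the no-skill level for the SEER cohort, although the prevalence in the external cohort is not stated in the abstract.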

Application of routine test big data in early diagnosis of gastric cancer

Application of machine learning in the diagnosis of gastric cancer based on noninvasive characteristics

Predicting early gastric cancer risk using machine learning: A population-based retrospective study

Application of Machine Learning in the Diagnosis of Early Gastric Cancer Using the Kyoto Classification Score and Clinical Features Collected from Medical Consultations

Application of data mining methods to improve screening for the risk of early gastric cancer

Development of a Routine Serological Test Index Panel for the Surveillance of Gastric Cancer Risk in a High-Risk Population

Machine Learning: A Non-Invasive Prediction Method for Gastric Cancer Based on a Survey of Lifestyle Behaviors

Development and validation of an artificial neural network model for non-invasive gastric cancer screening and diagnosis

Application of Machine Learning Algorithms to Predict Lymph Node Metastasis in Early Gastric Cancer

A feasibility study on utilizing machine learning technology to reduce the costs of gastric cancer screening in Taizhou, China

Performance evaluation of four prediction models for risk stratification in gastric cancer screening among a high-risk population in China

Application of support vector machine model for enhancing the diagnostic value of tumor markers in gastric cancer

Prediction algorithm for gastric cancer in a general population: A validation study

Construction and Validation of a Gastric Cancer Diagnostic Model based on Blood Groups and Tumor Markers

The value of machine learning approaches in the diagnosis of early gastric cancer: a systematic review and meta-analysis

Development of a deep learning model for early gastric cancer diagnosis using preoperative computed tomography images

Enhancing the diagnostic accuracy of colorectal cancer through the integration of serum tumor markers and hematological indicators with machine learning algorithms

Applying machine learning techniques to predict the risk of distant metastasis from gastric cancer: a real world retrospective study

Application of the Combined Detection of Pepsinogen, Gastrin-17, CEA and CA19-9 in the Diagnosis of Gastric Cancer

Five Common Tumor Biomarkers and CEA for Diagnosing Early Gastric Cancer: A Protocol for a Network Meta-Analysis of Diagnostic Test Accuracy

Early Screening of Colorectal Precancerous Lesions Based on Combined Measurement of Multiple Serum Tumor Markers Using Artificial Neural Network Analysis