Abstract:

BACKGROUND: Single-cell sequencing makes it possible to analyze changes in specific cell types during disease progression. However, previous single-cell sequencing studies of gastric cancer (GC) have largely focused on immune and stromal cells, and the alterations that occur in gastric epithelial cells during the development of GC remain to be elucidated.

AIM: To create a GC prediction model based on single-cell and bulk RNA sequencing (bulk RNA-seq) data.

METHODS: We integrated three single-cell RNA sequencing (scRNA-seq) datasets and ten bulk RNA-seq datasets. The analysis focused on determining cell proportions and identifying differentially expressed genes (DEGs). Specifically, we performed differential expression analysis between epithelial cells in GC tissues and normal gastric tissues (NAGs) and used both single-cell and bulk RNA-seq data to establish a prediction model for GC. We then validated the accuracy of the GC prediction model in bulk RNA-seq data and used Kaplan-Meier plots to assess the correlation between the genes in the model and the prognosis of GC.

RESULTS: Analysis of scRNA-seq data from a total of 70707 cells from GC tissue, NAG, and chronic gastric tissue identified 10 cell types, and DEGs between GC and normal epithelial cells were screened. After the DEGs between GC and normal gastric samples were determined from bulk RNA-seq data, GC predictive classifiers were constructed using the least absolute shrinkage and selection operator (LASSO) and random forest methods. The LASSO classifier performed well in both validation and model verification using The Cancer Genome Atlas and Genotype-Tissue Expression (GTEx) datasets [area under the curve (AUC)_min = 0.988, AUC_1se = 0.994], and the random forest model also achieved good results on the validation set (AUC = 0.92). The genes TIMP1, PLOD3, CKS2, TYMP, TNFRSF10B, CPNE1, GDF15, BCAP31, and CLDN7 had high importance values in multiple GC predictive models, and KM-PLOTTER analysis showed their relevance to GC prognosis, suggesting their potential for use in GC diagnosis and treatment.

CONCLUSION: A predictive classifier was established based on the analysis of RNA-seq data, and the genes in it are expected to serve as auxiliary markers in the clinical diagnosis of GC.
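
To make the classifier-plus-AUC idea concrete, here is a minimal sketch of an L1-penalized (LASSO-style) logistic classifier evaluated by AUC on a held-out set. It is an illustration under stated assumptions, not the study's pipeline: the expression data are simulated, only the first of five hypothetical "genes" carries signal, the penalty `lam` is a single fixed value rather than one chosen by cross-validation, and every function name (`fit_l1_logistic`, `simulate`, `auc`) is invented for this example. The random forest model is not covered.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrinks v toward zero by t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.05, lr=0.1, n_iter=300):
    # Proximal gradient descent for L1-penalized logistic regression.
    # The intercept b is not penalized.
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        w = [soft_threshold(w[j] - lr * grad_w[j] / n, lr * lam) for j in range(p)]
        b -= lr * grad_b / n
    return w, b


def auc(labels, scores):
    # AUC as the fraction of (positive, negative) pairs ranked correctly,
    # counting ties as half.
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if a > c else 0.5 if a == c else 0.0 for a in pos for c in neg)
    return wins / (len(pos) * len(neg))


def simulate(n, rng, n_genes=5, shift=3.0):
    # Simulated expression matrix: gene 0 is shifted upward in "tumor"
    # samples (label 1); the remaining genes are pure noise.
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        row[0] += shift * label
        X.append(row)
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = simulate(200, rng)
X_test, y_test = simulate(200, rng)

w, b = fit_l1_logistic(X_train, y_train)
test_scores = [sum(wj * xj for wj, xj in zip(w, xi)) + b for xi in X_test]
test_auc = auc(y_test, test_scores)

print("weights:", [round(wj, 3) for wj in w])
print("held-out AUC:", round(test_auc, 3))
```

The L1 penalty shrinks the coefficients of uninformative features toward zero, which is why LASSO-type models are often used to reduce a long DEG list to a short gene panel. In practice the penalty strength would be selected by cross-validation rather than fixed as it is here.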

Highly accurate two-gene signature for gastric cancer

Identification of gene signatures used to recognize biological characteristics of gastric cancer upon gene expression data.

Identification of potential biomarkers with colorectal cancer based on bioinformatics analysis and machine learning

Identification of Significant Biomarkers and Pathways Associated with Gastric Carcinogenesis by Whole Genome-Wide Expression Profiling Analysis

A transcriptomic study for identifying cardia- and non-cardia-specific gastric cancer prognostic factors using genetic algorithm-based methods

Identification of Tumor Mutation Burden, Microsatellite Instability, and Somatic Copy Number Alteration Derived Nine Gene Signatures to Predict Clinical Outcomes in STAD.

Identification of the diagnostic genes and immune cell infiltration characteristics of gastric cancer using bioinformatics analysis and machine learning

A novel strategy to identify candidate diagnostic and prognostic biomarkers for gastric cancer

Identification of a Nine-Gene Prognostic Signature for Gastric Carcinoma Using Integrated Bioinformatics Analyses

Identification and validation of a prognostic 9-genes expression signature for gastric cancer.

GeneExpressScore Signature: a Robust Prognostic and Predictive Classifier in Gastric Cancer

Identifying Novel Cell Glycolysis-Related Gene Signature Predictive of Overall Survival in Gastric Cancer

Selection of Gastric Cancer Subgroups Marker Genes Based on Machine Learning Methods

The prediction of survival in Gastric Cancer based on a Robust 13-Gene Signature

Integrated Bioinformatics Analysis Reveals Novel Key Biomarkers and Potential Candidate Small Molecule Drugs in Gastric Cancer.

Establishment of a prognostic model of four genes in gastric cancer based on multiple data sets

Establishing a Cancer Driver Gene Signature-Based Risk Model for Predicting the Prognoses of Gastric Cancer Patients.

Identification of Potential Key Genes in Gastric Cancer Using Bioinformatics Analysis

Integrated Analysis of Single-Cell and Bulk RNA-seq Establishes a Novel Signature for Prediction in Gastric Cancer.

Identification of a 3-Gene Model as Prognostic Biomarker in Patients With Gastric Cancer

Screening and Identification of Key Biomarkers of Gastric Cancer: Three Genes Jointly Predict Gastric Cancer