Integrative analysis reveals a four-gene signature for predicting survival and immunotherapy response in colon cancer patients using bulk and single-cell RNA-seq data

Ruoyang Chai, Yajie Zhao, Zhengjia Su, Wei Liang
DOI: https://doi.org/10.3389/fonc.2023.1277084
2023-10-31
Abstract:
Background: Colon cancer (CC) is one of the leading causes of cancer-related mortality worldwide. Single-cell RNA sequencing (scRNA-seq) provides gene expression data resolved by cell type. This study aimed to use scRNA-seq and bulk RNA sequencing (bulk RNA-seq) data from CC samples to develop a novel prognostic model.
Methods: scRNA-seq data were downloaded from the GSE161277 dataset. The R packages "Seurat", "Harmony", and "singleR" were used to classify eight major cell types in normal and tumor tissues. Differentially expressed genes (DEGs) between tumor and normal samples were identified within each major cell type. Gene Ontology (GO) enrichment analyses of the DEGs for each cell type were conducted using "Metascape". A DEG-based signature was constructed using Cox regression and least absolute shrinkage and selection operator (LASSO) analyses in The Cancer Genome Atlas (TCGA) training cohort and validated in the GSE39582 and GSE33382 datasets. The expression pattern of the prognostic genes was verified using spatial transcriptome sequencing (ST-seq) data. A prognostic nomogram based on the gene signature and age was then established and calibrated. Sensitivity to chemotherapeutic drugs was predicted with the "oncoPredict" R package.
Results: Using scRNA-seq data, we examined 33,213 cells and classified them into eight cell types in normal and tumor samples. GO enrichment analysis showed that the DEGs in these cell types were enriched in various cancer-related pathways. Among the 55 DEGs identified via univariate Cox regression, four independent prognostic genes emerged: PTPN6, CXCL13, SPINK4, and NPDC1. ST-seq confirmed that PTPN6 and CXCL13 were expressed predominantly in immune cells, whereas SPINK4 and NPDC1 were relatively specific to epithelial cells. Kaplan-Meier survival analyses of the resulting four-gene prognostic signature showed that higher risk scores were associated with unfavorable prognosis in both the training and validation cohorts. The risk score was an independent prognostic factor and supported a reliable nomogram. Drug sensitivity analysis predicted contrasting anti-cancer drug responses in the two risk groups, suggesting significant clinical implications.
Conclusion: We developed a novel prognostic four-gene risk model, and these genes may act as potential therapeutic targets for CC.
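The risk score used to separate the two risk groups can be illustrated with a minimal sketch. It assumes the common form for Cox/LASSO signatures, a weighted sum of the four genes' expression values, and a median cutoff. The abstract reports neither the fitted coefficients nor the cutoff, so the coefficient values and signs, the median split, and the patient data below are all hypothetical placeholders rather than the authors' model.

```python
from statistics import median

# HYPOTHETICAL coefficients (values and signs are arbitrary placeholders);
# the abstract does not report the fitted LASSO/Cox coefficients.
COEFFICIENTS = {
    "PTPN6": -0.20,
    "CXCL13": -0.10,
    "SPINK4": -0.05,
    "NPDC1": 0.30,
}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Return the weighted sum of expression values over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def assign_risk_groups(scores):
    """Label each patient 'high' or 'low' relative to the cohort median.

    The median cutoff is an assumption; the abstract only states that
    patients fall into two risk groups.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low")
            for pid, score in scores.items()}


# Made-up expression values for four hypothetical patients.
patients = {
    "P1": {"PTPN6": 5.0, "CXCL13": 3.0, "SPINK4": 2.0, "NPDC1": 1.0},
    "P2": {"PTPN6": 2.0, "CXCL13": 1.0, "SPINK4": 1.0, "NPDC1": 4.0},
    "P3": {"PTPN6": 4.0, "CXCL13": 2.0, "SPINK4": 3.0, "NPDC1": 2.0},
    "P4": {"PTPN6": 1.0, "CXCL13": 0.5, "SPINK4": 0.5, "NPDC1": 5.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = assign_risk_groups(scores)

for pid in patients:
    print(pid, round(scores[pid], 3), groups[pid])
```

In a real analysis the coefficients would come from the fitted model on the training cohort, and the same coefficients and cutoff rule would then be applied unchanged to each validation cohort.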