Integrating Single-Cell and Bulk Expression Data to Identify and Analyze Cancer Prognosis-Related Genes

Shengbao Bao, Yaxin Fan, Yichao Mei, Junxiang Gao
DOI: https://doi.org/10.1016/j.heliyon.2024.e25640
IF: 3.776
2024-01-01
Heliyon
Abstract: Compared with traditional methods of evaluating cancer prognosis based on tissue samples, single-cell sequencing technology can provide information on cell type heterogeneity for predicting biomarkers related to cancer prognosis. We therefore comprehensively analyzed the bulk and single-cell expression profiles of breast cancer and normal cells to identify malignant and non-malignant markers and construct a reliable prognosis model. We first screened highly reliable differentially expressed genes from bulk expression profiles of multiple breast cancer tissues and normal tissues, and inferred genes related to cell malignancy from single-cell data. We then identified eight critical genes related to breast cancer, used them in Cox regression analysis to calculate a polygenic risk score (PRS), and verified the predictive ability of the PRS in two data groups. The results show that the PRS can divide breast cancer patients into high-risk and low-risk groups. The PRS is related to overall survival time and relapse-free interval, and is a prognostic factor independent of conventional clinicopathological characteristics. Breast cancer is usually regarded as a cancer with a relatively good prognosis. To explore whether this workflow can also be applied to a cancer with poor prognosis, we selected lung cancer for a comparative study. The results show that the workflow can also build a reasonable prognosis model for lung cancer. This study provides new insight and practical source code for further research on cancer biomarkers and drug targets. It also provides a basis for survival prediction, treatment response prediction, and personalized treatment.
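The PRS step described in the abstract can be illustrated with a minimal sketch. It assumes the common formulation in which the score is the sum of each gene's expression weighted by its Cox regression coefficient, and that patients are split at the cohort median; the abstract states neither detail. The gene names, coefficients, expression values, and function names below are hypothetical placeholders, not the paper's eight genes or its source code.

```python
from statistics import median


def polygenic_risk_score(expression, coefficients):
    """Weighted sum of expression values, using Cox coefficients as weights."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def stratify_by_median(scores):
    """Label a patient 'high' if the score exceeds the cohort median, else 'low'.

    The median cutoff is an assumption; other cutoffs are possible.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low") for pid, score in scores.items()}


# Hypothetical coefficients for placeholder genes (not taken from the paper).
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25, "GENE_C": 1.0}

# Hypothetical normalized expression values for four patients.
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 1.0},
    "P2": {"GENE_A": 0.0, "GENE_B": 8.0, "GENE_C": 0.0},
    "P3": {"GENE_A": 4.0, "GENE_B": 0.0, "GENE_C": 2.0},
    "P4": {"GENE_A": 1.0, "GENE_B": 2.0, "GENE_C": 0.5},
}

scores = {pid: polygenic_risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = stratify_by_median(scores)
```

In practice the coefficients would come from a Cox model fitted on survival data, and the resulting groups would be compared with survival curves; both steps are outside this sketch.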