Identification and validation of a novel prognostic signature based on transcription factors in breast cancer by bioinformatics analysis

Yingmei Yang, Zhaoyun Li, Qianyi Zhong, Lei Zhao, Yichao Wang, Hongbo Chi
DOI: https://doi.org/10.21037/gs-22-267
Abstract:
Background: Breast cancer (BRCA) is the leading cause of cancer mortality among women, and it is associated with many tumor suppressors and oncogenes. There is increasing evidence that transcription factors (TFs) play vital roles in human malignancies, but TF-based biomarkers for BRCA prognosis remain scarce and are needed. This study sought to develop and validate a TF-based prognostic model for BRCA patients.
Methods: Differentially expressed TFs were screened from 1,109 BRCA and 113 non-tumor samples downloaded from The Cancer Genome Atlas (TCGA). Univariate Cox regression analysis was used to identify TFs associated with overall survival (OS) in BRCA, and multivariate Cox regression analysis was performed to establish the optimal risk model. The predictive value of the TF model was established using the TCGA database and validated using a Gene Expression Omnibus (GEO) data set (GSE20685). A gene set enrichment analysis was conducted to identify the signaling pathways enriched in high-risk and low-risk BRCA patients. Gene Ontology (GO) function and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of the TF target genes were also conducted separately.
Results: A total of 394 differentially expressed TFs were screened. A 9-TF prognostic model, comprising PAX7, POU3F2, ZIC2, WT1, ALX4, FOXJ1, SPIB, LEF1, and NFE2, was constructed and validated. Patients in the high-risk group had worse clinical outcomes than those in the low-risk group (P<0.001). The areas under the curve of the prognostic model for 5-year OS were 0.722 in the training cohort and 0.651 in the testing cohort. Additionally, the risk score was an independent prognostic indicator for BRCA patients in both the training cohort (HR =1.757, P<0.001) and the testing cohort (HR =1.401, P=0.001), and it was associated with various cancer signaling pathways. Ultimately, 9 overlapping target genes were predicted by 3 prediction nomograms. The GO and KEGG enrichment analyses of these target genes suggested that the TFs in the model may regulate the activation of some classical tumor signaling pathways through these target genes and thereby control the progression of BRCA.
Conclusions: Our study developed and validated a novel prognostic TF model that can effectively predict 5-year OS for BRCA patients.
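For readers unfamiliar with how a Cox-based signature separates patients into high-risk and low-risk groups, the following minimal sketch may help. It assumes the usual convention that a risk score is the sum of each gene's expression weighted by its multivariate Cox coefficient, and that patients are split at the cohort median. The abstract states neither the coefficients nor the cutoff rule, so the coefficient values, their signs, the expression values, the patient IDs, and the median cutoff below are all hypothetical placeholders, and only three of the nine TFs are used to keep the example short.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify(scores):
    """Label each patient 'high' or 'low' relative to the cohort median (assumed cutoff)."""
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low") for pid, score in scores.items()}


# Hypothetical placeholder coefficients; NOT the values from the paper.
PLACEHOLDER_COEFS = {"LEF1": 0.5, "WT1": 0.25, "SPIB": -0.5}

# Hypothetical expression values for four made-up patients.
patients = {
    "P1": {"LEF1": 2.0, "WT1": 4.0, "SPIB": 1.0},
    "P2": {"LEF1": 1.0, "WT1": 0.0, "SPIB": 2.0},
    "P3": {"LEF1": 4.0, "WT1": 2.0, "SPIB": 0.0},
    "P4": {"LEF1": 0.0, "WT1": 2.0, "SPIB": 0.0},
}

scores = {pid: risk_score(expr, PLACEHOLDER_COEFS) for pid, expr in patients.items()}
groups = stratify(scores)

for pid in patients:
    print(pid, scores[pid], groups[pid])
```

In a real analysis the coefficients would come from the fitted multivariate Cox model, and the cutoff learned on the training cohort would typically be reused when scoring the validation cohort.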