Identification of Key Genes and Pathways by Bioinformatics Analysis with TCGA RNA Sequencing Data in Hepatocellular Carcinoma

Qiandong Zhu, Yunpeng Sun, Qingqing Zhou, Qikuan He, Haixin Qian
DOI: https://doi.org/10.3892/mco.2018.1728
2018-01-01
Molecular and Clinical Oncology
Abstract: Improved insight into the molecular characteristics of hepatocellular carcinoma (HCC) is required to predict prognosis and to develop a new rationale for targeted therapeutic strategies. Bioinformatics methods, including functional enrichment and network analysis combined with survival analysis, are required to process large volumes of data and obtain further information on differentially expressed genes (DEGs). The RNA sequencing data related to HCC in The Cancer Genome Atlas (TCGA) database were analyzed to screen for DEGs. The upregulated and downregulated DEGs were separately submitted to gene enrichment analysis to identify gene sets and signaling pathways, and were used to construct a protein-protein interaction (PPI) network. Subsequently, hub genes were selected according to their core level in the network, and the top hub genes were examined further by gene expression analysis and survival analysis. A total of 610 DEGs were identified, including 444 upregulated and 166 downregulated genes. The upregulated DEGs were significantly enriched in the Gene Ontology (GO) term "cell division" and in the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway "cell cycle", whereas the downregulated DEGs were significantly enriched in the GO term "negative regulation of growth" and in the KEGG pathway "retinol metabolism". Cyclin-dependent kinase (CDK)1 was selected as the top hub gene from the PPI network, and its expression trend was similar to that observed in data from the Gene Expression Omnibus (GEO) database. Survival analysis revealed a significant negative correlation between CDK1 expression level and overall survival in the TCGA group (P<0.01) and the GEO group (P<0.01). Therefore, high-throughput TCGA data analysis appears to be an effective method for screening tumor molecular markers, and high expression of CDK1 is a prognostic factor for HCC.
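The hub-gene selection step can be illustrated with a minimal sketch. The abstract does not specify how "core level" was computed, so the sketch assumes plain node degree (the number of distinct interaction partners) as a stand-in. The function name rank_hub_genes, the parameter top_n, and the toy network with placeholder gene names are all hypothetical and are not data from the study.

```python
from collections import defaultdict


def rank_hub_genes(edges, top_n=3):
    """Rank nodes of an undirected PPI network by degree.

    Assumption: degree (count of distinct partners) stands in for the
    "core level" mentioned in the abstract, which is not defined there.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Sort by descending degree, then by name so ties resolve deterministically.
    ranked = sorted(neighbors.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(gene, len(partners)) for gene, partners in ranked[:top_n]]


# Hypothetical toy network with placeholder names (not real interaction data).
toy_edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
    ("GENE_D", "GENE_E"),
]

hubs = rank_hub_genes(toy_edges, top_n=2)
print(hubs)
```

In this toy network GENE_A has the most partners and is ranked first. A real analysis would typically take its edges from a curated interaction database and might use a different centrality measure.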