The role of differentially expressed genes and immune cell infiltration in the progression of nonalcoholic steatohepatitis (NASH) to hepatocellular carcinoma (HCC): a new exploration based on bioinformatics analysis

Yang Liu, Xiaohan Yu, Yuegu Wang, Jinge Wu, Bo Feng, Meng Li
DOI: https://doi.org/10.1080/15257770.2024.2310044
2024-02-08
Nucleosides Nucleotides & Nucleic Acids
Abstract: Nonalcoholic fatty liver disease (NAFLD) is a spectrum of chronic liver disease. The condition ranges from isolated excessive hepatocyte triglyceride accumulation and steatosis (nonalcoholic fatty liver, NAFL), to hepatic triglyceride accumulation plus inflammation and hepatocyte injury (nonalcoholic steatohepatitis, NASH), and finally to hepatic fibrosis and cirrhosis and/or hepatocellular carcinoma (HCC). However, the mechanism driving this progression is not yet clear. Sample microarray data were obtained from the GEO database: 6 healthy liver samples, 74 nonalcoholic hepatitis samples, 8 liver cirrhosis samples, and 53 liver cancer samples were extracted from the GSE164760 dataset. We used the GEO2R tool to analyze differentially expressed genes (DEGs) across disease progression (nonalcoholic hepatitis vs. healthy, cirrhosis vs. nonalcoholic hepatitis, and liver cancer vs. cirrhosis) and for a necroptosis gene set. Gene set variation analysis (GSVA) was used to evaluate the association between biological pathways and gene features. The STRING database and Cytoscape software were used to establish and visualize protein-protein interaction (PPI) networks, to identify the key functional modules of DEGs, and to draw a transcription factor (TF)-target gene regulatory network. Gene Ontology (GO) and KEGG pathway enrichment analyses of DEGs were also performed. Additionally, immune infiltration patterns were analyzed using CIBERSORT, and the correlation between immune cell-type abundance and DEG expression was investigated. A total of 152 intersecting DEGs were obtained from the three comparison groups, and 23 key genes were identified with the MCODE plugin. Transcription factors regulating the common DEGs were retrieved from the hTFtarget database, and a TF-target network diagram was drawn. The PPI network contained 118 nodes, 251 edges, and 4 clusters. The key genes of the four modules were METAP2, RPL14, SERBP1, and EEF2; HR4A1; CANX; and ARID1A and UBE2K. METAP2, RPL14, SERBP1, and EEF2 were identified as the key hub genes. CREB1 was identified as the hub TF interacting with those genes by taking the intersection of potential TFs. The alterations observed in the key genes were genetic mutations, with mutation frequencies of 1.7% in EEF2, 0.8% in METAP2, and 0.3% in RPL14. Finally, the immune infiltrating cells whose abundance differed most significantly among the three groups were Tregs and M2 and M0 macrophages. We identified four hub genes, METAP2, RPL14, SERBP1, and EEF2, as those most closely associated with the progression from NASH to cirrhosis to HCC. Examining the interaction between hub DEGs and potential regulatory molecules is useful for understanding this process. This knowledge may provide a novel theoretical foundation for developing diagnostic biomarkers and gene-related therapeutic targets.
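Two steps in the abstract reduce to set intersections: keeping the DEGs shared by all three stage comparisons, and keeping the TFs shared by the hub genes. The following is a minimal sketch of that idea only, not the authors' pipeline. The cutoffs (`min_abs_log_fc`, `max_adj_p`), the function names, and the placeholder gene and TF names are all assumptions made for illustration; the abstract does not state the thresholds used, and the toy rows are not values from GSE164760.

```python
from typing import Dict, Iterable, Set, Tuple

# One row of a differential-expression table:
# (gene symbol, log2 fold change, adjusted p-value).
DegRow = Tuple[str, float, float]


def significant_genes(rows: Iterable[DegRow],
                      min_abs_log_fc: float = 1.0,
                      max_adj_p: float = 0.05) -> Set[str]:
    """Return genes passing both cutoffs (cutoff values are assumed)."""
    return {
        gene
        for gene, log_fc, adj_p in rows
        if abs(log_fc) >= min_abs_log_fc and adj_p < max_adj_p
    }


def intersect_all(groups: Dict[str, Set[str]]) -> Set[str]:
    """Return the items present in every set of `groups` (empty if no groups)."""
    sets = list(groups.values())
    return set.intersection(*sets) if sets else set()


# Hypothetical toy tables with placeholder gene names.
comparisons = {
    "NASH_vs_healthy": [("GENE_A", 1.8, 0.001), ("GENE_B", -1.2, 0.02), ("GENE_C", 0.3, 0.4)],
    "cirrhosis_vs_NASH": [("GENE_A", 1.1, 0.01), ("GENE_B", -2.0, 0.003), ("GENE_D", 1.5, 0.01)],
    "HCC_vs_cirrhosis": [("GENE_A", 2.4, 0.0001), ("GENE_B", -0.4, 0.2), ("GENE_D", 1.9, 0.02)],
}

# DEGs significant in every stage comparison.
deg_sets = {name: significant_genes(rows) for name, rows in comparisons.items()}
shared_degs = intersect_all(deg_sets)

# Hypothetical candidate TFs per hub gene; the shared TF is the "hub TF".
tf_by_gene = {
    "HUB_1": {"TF_X", "TF_Y"},
    "HUB_2": {"TF_X", "TF_Z"},
}
hub_tfs = intersect_all(tf_by_gene)

print(sorted(shared_degs))
print(sorted(hub_tfs))
```

The same `intersect_all` helper serves both steps because each is "keep what every group agrees on"; in practice the input tables would come from exported GEO2R results and a TF-target lookup rather than inline literals.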
biochemistry & molecular biology