Integrating single-cell and bulk sequencing data to identify glycosylation-based genes in non-alcoholic fatty liver disease-associated hepatocellular carcinoma

Zhijia Zhou, Yanan Gao, Longxin Deng, Xiaole Lu, Yancheng Lai, Jieke Wu, Shaodong Chen, Chengzhong Li, Huiqing Liang
DOI: https://doi.org/10.7717/peerj.17002
IF: 3.061
2024-03-18
PeerJ
Abstract: Background: The incidence of non-alcoholic fatty liver disease (NAFLD)-associated hepatocellular carcinoma (HCC) has been increasing. However, the role of glycosylation, an important modification that alters cellular differentiation and immune regulation, in the progression of NAFLD to HCC has rarely been studied. Methods: We used the NAFLD-HCC single-cell dataset to identify variation in the expression of glycosylation patterns between different cells, and used the HCC bulk dataset to link these variations to the prognosis of HCC patients. Machine learning algorithms were then used to identify glycosylation-related signatures with prognostic significance and to construct a model for predicting the prognosis of HCC patients. The model was validated in high-fat diet-induced mice and clinical cohorts. Results: The NAFLD-HCC Glycogene Risk Model (NHGRM) signature included the following genes: SPP1, SOCS2, SAPCD2, S100A9, RAMP3, and CSAD. Higher NHGRM scores were associated with a poorer prognosis and with stronger immune-related features, immune cell infiltration, and immunity scores. Animal experiments, external cohorts, and clinical cohorts confirmed the expression of these genes. Conclusion: The genetic signature we identified may serve as a potential indicator of survival in patients with NAFLD-HCC and provides new perspectives for elucidating the role of glycosylation-related signatures in this pathologic process.
multidisciplinary sciences