Screening of key expressed genes and analysis of related immune cell infiltration in prostate cancer using bioinformatics and machine learning algorithms
高文治,何宇辉,朱振鹏,张家锋,巩艳青,何世明,周利群,郭跃先,李学松
DOI: https://doi.org/10.3760/cma.j.cn421213-20210509-01149
2022-01-01
Abstract:
Objective: To screen and analyze the key expressed genes of prostate cancer using bioinformatics and machine learning algorithms, and to explore biomarkers for the diagnosis of prostate cancer and their correlation with immune cell infiltration.
Methods: Three prostate cancer tissue mRNA microarray datasets were downloaded from the Gene Expression Omnibus (GEO) database: GSE46602 and GSE69334 were used as training sets, and GSE32571 as the validation set. Differentially expressed genes (DEGs) were obtained by combining data sets GSE46602 and GSE69223. Kyoto Encyclopedia of Genes and Genomes (KEGG), Gene Ontology (GO), Disease Ontology (DO) and gene set enrichment analysis (GSEA) were used for functional enrichment analysis. Least absolute shrinkage and selection operator (LASSO) regression selected 11 characteristic genes and the support vector machine (SVM) selected 2 characteristic genes; the intersection of the two results gave two characteristic genes, hepsin (HPN) and keratin 23 (KRT23), which were then verified in data set GSE32571. In addition, real-time fluorescence quantitative polymerase chain reaction was carried out in prostate cancer-related cell lines to verify the two characteristic genes, and the relationship between the two genes and immune cell infiltration was analyzed.
Results: A total of 35 DEGs, including 20 down-regulated genes and 15 up-regulated genes, and two core genes were found in three prostate cancer datasets from the GEO database using the R language and machine learning methods. GO, KEGG, DO and GSEA analyses revealed that these genes were enriched in functions such as epidermal cell differentiation and keratinization, and in the extracellular matrix receptor interaction and estrogen receptor pathways. The characteristic genes screened by LASSO and SVM were tested in data set GSE32571.
HPN and KRT23 were found to be two diagnostic biomarkers of prostate cancer, and their mRNA levels in prostatic adenocarcinoma cell lines were consistent with the results of the bioinformatics analysis. The expression levels of HPN in the Du145, PC3, Vcap, Lncap, C4-2 and 22RV1 groups (1.10±0.29, 0.46±0.12, 3.02±0.79, 1.58±0.09, 0.39±0.02, 0.41±0.07) were higher than that in the RWPE1 group (0.09±0.01), and the differences were statistically significant (t=6.000, 5.030, 6.400, 27.980, 15.600, 6.870, P<0.05). The expression levels of KRT23 in the Du145, PC3, Vcap, Lncap, C4-2 and 22RV1 groups (0.42±0.01, 0.15±0.03, 0.15±0.02, 0.15±0.03, 0.62±0.09, 0.04±0.01) were lower than that in the RWPE1 group (1.01±0.19), and the differences were statistically significant (t=5.210, 7.600, 7.620, 7.580, 3.120, 8.630, P<0.05). HPN and KRT23 were correlated with immune cells: HPN was negatively correlated with CD8 T cells, resting mast cells and resting dendritic cells, and positively correlated with M0 macrophages; KRT23 was negatively correlated with M0 macrophages and positively correlated with resting dendritic cells and resting mast cells.
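As a rough illustration of how a t value of this kind can be derived from group means and standard deviations, the sketch below computes a two-sample t statistic for HPN in Du145 versus RWPE1 from the summary values quoted above. The number of replicates per group is not stated in the abstract; `N_REPLICATES = 3` is purely an assumption, and the function name `t_from_summary` is hypothetical. With equal group sizes, the pooled-variance and Welch forms of the statistic coincide, so the sketch does not need to choose between them.

```python
import math

def t_from_summary(mean1, sd1, mean2, sd2, n):
    """Two-sample t statistic from group means and SDs, n replicates per group.

    With equal n in both groups, the pooled-variance (Student) and Welch
    statistics are identical, so only one formula is needed here.
    """
    standard_error = math.sqrt((sd1 ** 2 + sd2 ** 2) / n)
    return (mean1 - mean2) / standard_error

# ASSUMPTION: replicates per group; the abstract does not report this number.
N_REPLICATES = 3

# HPN expression: Du145 (1.10 ± 0.29) versus RWPE1 (0.09 ± 0.01), as quoted above.
t_hpn_du145 = t_from_summary(1.10, 0.29, 0.09, 0.01, N_REPLICATES)
print(round(t_hpn_du145, 2))
```

Under that assumption the statistic comes out at about 6.03, close to the reported 6.000; exact agreement should not be expected, since the quoted means and standard deviations are rounded and the true sample size may differ.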
Conclusion: HPN and KRT23 can be used as diagnostic biomarkers of prostate cancer, and both are related to immune cells such as follicular helper T cells and regulatory T cells.
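The feature-selection step described in the Methods (keeping only genes chosen by both LASSO and a second selector) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the gene names, the penalty `lam=0.3`, the simulated expression values and the helper names are all assumptions, the LASSO is a plain coordinate-descent implementation without an intercept (features are simulated with zero mean), and the SVM-based selection is not implemented but replaced by a hypothetical stand-in set.

```python
import random

def soft_threshold(z, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta

# SYNTHETIC data: 6 hypothetical genes, of which only gene_0 and gene_3 drive the outcome.
random.seed(0)
gene_names = [f"gene_{j}" for j in range(6)]
n_samples = 200
X = [[random.gauss(0.0, 1.0) for _ in gene_names] for _ in range(n_samples)]
y = [2.0 * row[0] - 1.5 * row[3] + random.gauss(0.0, 0.1) for row in X]

beta = lasso_coordinate_descent(X, y, lam=0.3)
lasso_selected = {g for g, b in zip(gene_names, beta) if b != 0.0}

# HYPOTHETICAL stand-in for the output of the SVM-based selection step.
svm_selected = {"gene_0", "gene_3", "gene_5"}

# Characteristic genes are those chosen by both selectors.
shared = sorted(lasso_selected & svm_selected)
print(shared)
```

Taking the intersection is a conservative design choice: a gene is kept only if two selectors with different inductive biases both pick it, which tends to shrink the candidate list at the cost of possibly discarding genes that only one method favours.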