Based on machine learning, CDC20 has been identified as a biomarker for postoperative recurrence and progression in stage I & II lung adenocarcinoma patients

Rui Miao, Zhi Xu, Tao Han, Yafeng Liu, Jiawei Zhou, Jianqiang Guo, Yingru Xing, Ying Bai, Zhonglei He, Jing Wu, Wenxin Wang, Dong Hu
DOI: https://doi.org/10.3389/fonc.2024.1351393
2024-07-24
Abstract:
Objective: To use machine learning to identify genes associated with recurrence, invasion, and tumor stemness in lung adenocarcinoma (LUAD), and thereby uncover new therapeutic targets.
Methods: We first obtained a gene set related to recurrence and invasion from the GEO gene expression database. We then applied Weighted Gene Co-expression Network Analysis (WGCNA) to identify core gene modules and performed functional enrichment analysis on them. Next, we used the random forest and random survival forest algorithms to evaluate the genes within the key modules, which identified three crucial genes. One of these key genes was then selected for prognosis analysis and potential drug screening using the Kaplan-Meier tool. Finally, to examine the role of CDC20 in LUAD, we conducted a range of in vitro and in vivo experiments: wound healing assays, colony formation assays, Transwell migration assays, flow cytometric cell cycle analysis, western blotting, and a mouse tumor model.
Results: We collected a total of 279 samples from two datasets, GSE166722 and GSE31210, and identified 91 differentially expressed genes associated with recurrence, invasion, and stemness in LUAD. Functional enrichment analysis showed that these key gene clusters were primarily involved in microtubule binding, the spindle, the chromosomal region, organelle fission, and nuclear division. Using machine learning, we identified and validated three hub genes (CDC45, CDC20, TPX2); of these, CDC20 showed the highest correlation with tumor stemness and has received limited prior study. CDC20 was also closely associated with clinicopathological features and with poor overall survival (OS), progression-free interval (PFI), and progression-free survival (PFS), indicating an adverse prognosis in LUAD patients. Lastly, our functional experiments demonstrated that knocking down CDC20 inhibited cancer cell migration, invasion, proliferation, cell cycle progression, and tumor growth, possibly through the MAPK signaling pathway.
Conclusion: CDC20 has emerged as a novel biomarker for monitoring treatment response, recurrence, and disease progression in patients with lung adenocarcinoma. Further research on CDC20 as a potential therapeutic target is therefore warranted and could yield valuable insights for developing new treatments and improving patient outcomes.
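The prognosis analysis mentioned in the Methods rests on Kaplan-Meier survival estimation. As a rough illustration of what that estimator computes, here is a minimal sketch in plain Python. It is not the authors' pipeline or the tool they used: the function name `kaplan_meier`, the follow-up times, and the split into "high" and "low" CDC20 expression groups are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event was observed.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. recurrence) was observed, 0 if censored
    """
    observed = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # everyone who leaves the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Events and censored patients both drop out after time t.
        at_risk -= leaving[t]
    return curve


# Hypothetical follow-up data in months (NOT from the paper),
# grouped by an assumed CDC20 expression cutoff.
high_times, high_events = [5, 8, 12, 12, 20, 30], [1, 1, 1, 0, 1, 0]
low_times, low_events = [10, 25, 32, 40, 48, 60], [0, 1, 0, 1, 0, 0]

high_curve = kaplan_meier(high_times, high_events)
low_curve = kaplan_meier(low_times, low_events)

for label, curve in (("high CDC20", high_curve), ("low CDC20", low_curve)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

In this toy data the high-expression group's estimated survival falls faster than the low-expression group's, which is the kind of separation a Kaplan-Meier plot is used to show. A real analysis would also need a significance test such as the log-rank test and an established survival-analysis library rather than this hand-rolled estimator.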