Integration of machine learning for developing a prognostic signature related to programmed cell death in colorectal cancer
Qi‐Tong Xu,Jian‐Kun Qiang,Zhi‐Ye Huang,Wan‐Ju Jiang,Xi‐Mao Cui,Ren‐Hao Hu,Tao Wang,Xiang‐Lan Yi,Jia‐Yuan Li,Zuoren Yu,Shun Zhang,Tao Du,Jinhui Liu,Xiao‐Hua Jiang
DOI: https://doi.org/10.1002/tox.24157
IF: 4.109
2024-02-03
Environmental Toxicology
Abstract: Background Colorectal cancer (CRC) presents a significant global health burden, characterized by a heterogeneous molecular landscape and various genetic and epigenetic alterations. Programmed cell death (PCD) plays a critical role in CRC, offering potential targets for therapy by regulating cell elimination processes that can suppress tumor growth or trigger cancer cell resistance. Understanding the complex interplay between PCD mechanisms and CRC pathogenesis is crucial. This study aims to construct a PCD‐related prognostic signature in CRC by integrating machine learning algorithms, with the goal of improving the precision of CRC prognosis prediction.
Method We retrieved expression data and clinical information from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) datasets. Fifteen forms of PCD were identified, and the corresponding gene sets were compiled. Machine learning algorithms, including Lasso, Ridge, elastic net (Enet), StepCox, survivalSVM, CoxBoost, SuperPC, plsRcox, random survival forest (RSF), and gradient boosting machine, were integrated for model construction. The models were validated using six GEO datasets, and the programmed cell death score (PCDS) was established. The model's effectiveness was further compared with 109 transcriptome‐based CRC prognostic models.
Result Our integrated model identified differentially expressed PCD‐related genes and stratified CRC samples into four subtypes with distinct prognostic implications. The optimal combination of machine learning models, RSF + Ridge, showed superior performance compared with traditional methods. The PCDS stratified patients into high‐risk and low‐risk groups with significant survival differences. Further analysis revealed the prognostic relevance of immune cell types and pathways associated with the CRC subtypes. The model also identified hub genes and drug sensitivities relevant to CRC prognosis.
Conclusion The current study highlights the potential of integrating machine learning models to enhance the prediction of CRC prognosis. The developed prognostic signature, which is related to PCD, holds promise for personalized and effective therapeutic interventions in CRC.
toxicology,environmental sciences,water resources
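The abstract describes a risk score (the PCDS) that splits patients into high‐risk and low‐risk groups, and a comparison of candidate models by performance. The sketch below illustrates that general idea only and is not the authors' implementation. It assumes the score is a linear combination of gene expression values, that the cutoff is the cohort median, and that performance is measured by Harrell's concordance index; none of these details is stated in the abstract. The gene names, coefficients, and patient data are hypothetical.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over signature genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def stratify_by_median(scores):
    """Label each patient 'high' or 'low' risk relative to the median score (assumed cutoff)."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def concordance_index(times, events, scores):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when patient i had an observed event
    (events[i] == 1) and a shorter follow-up time than patient j.
    The pair is concordant when patient i also has the higher risk score;
    tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Hypothetical signature coefficients and toy cohort (not from the study).
coefficients = {"GENE_A": 0.8, "GENE_B": -0.5}
patients = [
    {"GENE_A": 2.0, "GENE_B": 1.0},
    {"GENE_A": 1.0, "GENE_B": 2.0},
    {"GENE_A": 3.0, "GENE_B": 0.0},
    {"GENE_A": 0.5, "GENE_B": 3.0},
]
times = [24, 60, 10, 72]   # follow-up in months
events = [1, 0, 1, 0]      # 1 = death observed, 0 = censored

scores = [risk_score(p, coefficients) for p in patients]
groups = stratify_by_median(scores)
c_index = concordance_index(times, events, scores)

print(groups)
print(c_index)
```

In a model comparison of the kind the abstract mentions, a function like `concordance_index` would be evaluated for each candidate model on each validation cohort, and the model with the best average would be retained. A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking.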