Feature Selection and Classification Technique for Predicting Lymph Node Metastasis of Papillary Thyroid Carcinoma

Dan Wu, Yan Zhuang, Guoliang Liao, Lin Han, Ke Chen, Cheng Li, Zhan Hua, Jiangli Lin
DOI: https://doi.org/10.1142/s2737416524400064
2024-07-03
Journal of Computational Biophysics and Chemistry
Abstract: Papillary thyroid carcinoma (PTC) is typically an indolent cancer, yet a minority of cases develop lymph node metastasis. Because the mechanisms of lymph node metastasis remain unclear, a considerable number of patients undergo unnecessary surgery. Identifying key genetic biomarkers in high-dimensional data remains a significant challenge and limits research progress in this area. Here, we propose a hybrid filter-wrapper feature selection strategy for core factor detection and develop MethyAE, a DNA methylation-based metastasis prediction model built on an end-to-end learning auto-encoder. The strategy identified 46 methylated CpG sites as crucial biomarkers for lymph node metastasis. On 447 PTC samples from The Cancer Genome Atlas (221 with metastasis, 226 without), the MethyAE model achieves 88.9% accuracy and 88.6% recall in predicting lymph node metastasis, outperforming commonly used machine learning methods such as logistic regression and random forest. Furthermore, the MethyAE model performs favorably on DNA methylation data from colon cancer, bladder cancer, and breast cancer. To the best of our knowledge, this is the first attempt to predict PTC lymph node metastasis from DNA methylation, offering decision-making criteria for avoiding unnecessary surgery and selecting appropriate treatment plans for a substantial cohort of PTC patients.
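The abstract does not spell out how the hybrid filter-wrapper strategy works, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a filter stage that ranks features by a standardized difference of class means and a wrapper stage that runs greedy forward selection with a nearest-centroid classifier; the scoring function, the classifier, the synthetic methylation-like data, and every function name are illustrative assumptions.

```python
import random
from statistics import mean, pstdev

def filter_scores(X, y):
    """Filter stage (assumed): standardized class-mean difference per feature."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        spread = pstdev(pos) + pstdev(neg) + 1e-12  # avoid division by zero
        scores.append(abs(mean(pos) - mean(neg)) / spread)
    return scores

def centroid_accuracy(X_tr, y_tr, X_va, y_va, features):
    """Validation accuracy of a nearest-centroid classifier on `features`."""
    centroids = {}
    for label in (0, 1):
        rows = [r for r, l in zip(X_tr, y_tr) if l == label]
        centroids[label] = [mean(r[j] for r in rows) for j in features]
    correct = 0
    for row, label in zip(X_va, y_va):
        dist = {c: sum((row[j] - m) ** 2 for j, m in zip(features, cen))
                for c, cen in centroids.items()}
        correct += min(dist, key=dist.get) == label
    return correct / len(y_va)

def hybrid_select(X_tr, y_tr, X_va, y_va, top_k=10):
    """Keep the top_k filter-ranked features, then add them greedily
    (wrapper stage) while validation accuracy keeps improving."""
    scores = filter_scores(X_tr, y_tr)
    candidates = sorted(range(len(scores)), key=lambda j: scores[j],
                        reverse=True)[:top_k]
    selected, best = [], 0.0
    improved = True
    while improved:
        improved = False
        best_j = None
        for j in candidates:
            if j in selected:
                continue
            acc = centroid_accuracy(X_tr, y_tr, X_va, y_va, selected + [j])
            if acc > best:
                best, best_j, improved = acc, j, True
        if improved:
            selected.append(best_j)
    return selected, best

# Synthetic stand-in for methylation beta values (NOT the TCGA data):
# 200 samples, 30 features, only features 0 and 1 carry class signal.
random.seed(0)
y = [i % 2 for i in range(200)]
X = [[random.gauss(0.5, 0.1) + (0.3 if label == 1 and j < 2 else 0.0)
      for j in range(30)] for label in y]
selected, best = hybrid_select(X[:140], y[:140], X[140:], y[140:])
print(selected, round(best, 3))
```

The filter stage is cheap and prunes the high-dimensional feature space; the wrapper stage is more expensive but evaluates features jointly against an actual classifier, which is why the two are commonly combined. A real pipeline would use cross-validation rather than a single hold-out split.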