Exploring the correlation between DNA methylation and biological age using an interpretable machine learning framework

Sheng Zhou, Jing Chen, Shanshan Wei, Chengxing Zhou, Die Wang, Xiaofan Yan, Xun He, Pengcheng Yan
DOI: https://doi.org/10.1038/s41598-024-75586-9
2024-10-15
Abstract: DNA methylation plays a significant role in regulating transcription and changes systematically with age, and these changes can be used to predict an individual's age. This study had two aims: first, to identify methylation sites associated with biological age; second, to construct a biological age prediction model and use machine learning to preliminarily explore the biological significance of methylation-associated genes. A biological age prediction model was constructed from human methylation data through data preprocessing, feature selection, statistical analysis, and machine learning. Subsequently, 15 methylation data sets were analyzed in depth using SHAP, GO enrichment, and KEGG analysis. XGBoost, LightGBM, and CatBoost identified 15 groups of methylation sites associated with biological age. SHAP values identified the cg23995914 locus as the most significant contributor to the biological age prediction. GO enrichment and KEGG analyses were then used for an initial exploration of the biological significance of the methylated loci.
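As a rough illustration of how SHAP-style attribution can rank methylation sites by their contribution to a predicted age, the sketch below computes exact Shapley values for a toy linear model. Everything in it is an assumption made for illustration: the model `toy_age_model`, its coefficients, the site names, and the beta values are invented and do not come from the paper, which used XGBoost, LightGBM, and CatBoost with SHAP rather than this brute-force calculation.

```python
from itertools import combinations
from math import factorial

# Hypothetical site names, used only for this illustration.
SITES = ["site_a", "site_b", "site_c"]

def toy_age_model(beta):
    """Invented linear age predictor over three methylation beta values."""
    return 20.0 + 60.0 * beta[0] + 15.0 * beta[1] - 5.0 * beta[2]

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample x relative to a baseline sample.

    Features outside a coalition are set to their baseline value.
    Cost grows exponentially with the number of features, so this is
    only practical for tiny examples.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi

# Made-up beta values for one sample and a made-up reference sample.
sample = [0.8, 0.4, 0.6]
baseline = [0.5, 0.5, 0.5]

phi = shapley_values(toy_age_model, sample, baseline)

# Rank sites by the magnitude of their contribution, largest first.
ranking = sorted(zip(SITES, phi), key=lambda pair: abs(pair[1]), reverse=True)
for site, value in ranking:
    print(f"{site}: {value:+.2f} years")
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the property that makes a ranking such as "most significant contributor" meaningful. Libraries such as `shap` approximate or accelerate the same quantity for tree ensembles.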