Predicting locus-specific DNA methylation levels in cancer and paracancer tissues

Shuzheng Zhang, Baoshan Ma, Yu Liu, Yiwen Shen, Di Li, Shuxin Liu, Fengju Song
DOI: https://doi.org/10.2217/epi-2023-0114
2024-03-14
Epigenomics
Abstract: Aim: To predict base-resolution DNA methylation in cancerous and paracancerous tissues. Material & methods: We collected six cancer DNA methylation datasets from The Cancer Genome Atlas and five cancer datasets from Gene Expression Omnibus, and built machine learning models on paired cancerous and paracancerous tissues. Tenfold cross-validation and independent validation were performed to demonstrate the effectiveness of the proposed method. Results: The cross-tissue prediction models substantially increase accuracy at more than 68% of CpG sites and help enhance the statistical power of differential methylation analyses. An XGBoost model leveraging multiple correlated CpGs may further improve prediction accuracy. Conclusion: This study provides a powerful tool for DNA methylation analysis and may yield new epigenetic insights for cancer research.
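To make the validation scheme concrete, the following is a minimal sketch of tenfold cross-validation for a cross-tissue predictor at a single CpG site. It is an illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the prediction direction (paracancerous to cancerous) is assumed, and an ordinary least-squares line stands in for the paper's machine learning models (XGBoost is not used here). All function names are hypothetical.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:  # constant predictor: fall back to the mean of y
        return my, 0.0
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def kfold_indices(n, k, seed=0):
    """Shuffle sample indices and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_predictions(x, y, k=10, seed=0):
    """Predict each sample from a model trained on the other k-1 folds."""
    preds = [None] * len(x)
    for fold in kfold_indices(len(x), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(x)) if i not in held_out]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in fold:
            # Methylation beta values lie in [0, 1], so clip predictions.
            preds[i] = min(1.0, max(0.0, a + b * x[i]))
    return preds


def mean_absolute_error(y, preds):
    return statistics.fmean([abs(yi - pi) for yi, pi in zip(y, preds)])


# Synthetic paired samples (assumption): one beta value per tissue per patient.
rng = random.Random(42)
paracancer = [rng.uniform(0.1, 0.9) for _ in range(60)]
cancer = [min(1.0, max(0.0, 0.1 + 0.8 * p + rng.gauss(0, 0.03)))
          for p in paracancer]

preds = cross_validated_predictions(paracancer, cancer, k=10)
mae = mean_absolute_error(cancer, preds)
# Baseline: predict the cohort mean for every sample.
baseline = mean_absolute_error(cancer, [statistics.fmean(cancer)] * len(cancer))
print(f"cross-validated MAE: {mae:.3f}  mean-only baseline MAE: {baseline:.3f}")
```

In a real analysis this loop would run per CpG site, and comparing the cross-validated error against a baseline is one way to decide at which sites a cross-tissue model is worth using.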
genetics & heredity