iResNetDM: An interpretable deep learning approach for four types of DNA methylation modification prediction

Zerui Yang, Wei Shao, Yudai Matsuda, Linqi Song
DOI: https://doi.org/10.1016/j.csbj.2024.11.006
IF: 6.155
2024-11-16
Computational and Structural Biotechnology Journal
Abstract:
Motivation: Although several computational methods for predicting DNA methylation modifications have been developed, two main limitations persist. (1) All current models are binary predictors, which only determine the presence or absence of a DNA methylation modification and thus prevent comprehensive analysis of the interrelations among modification types. Multi-class classification models have been developed for RNA modifications, and a comparable approach for DNA is essential. (2) Few previous studies adequately explain how their models make decisions. They rely instead on extracting and visualizing attention matrices, which have identified few motifs and provide insufficient insight into the model's decision-making process.
Results: In this study, we formulate DNA methylation modification prediction as a multi-class classification problem for the first time. We present iResNetDM, a deep learning model that integrates Residual Networks (ResNet) with self-attention mechanisms. To the best of our knowledge, iResNetDM is the first model capable of distinguishing between four types of DNA methylation modifications. Our model not only performs well across the various DNA methylation modifications but also captures relationships between different types of modifications. We used the integrated gradients technique to enhance the interpretability of iResNetDM. This method effectively elucidates the model's decision-making process and enabled the successful identification of multiple motifs. Notably, our model displays remarkable robustness and can effectively identify unique motifs across different methylation modifications. We also compared the motifs discovered for the various modifications and found that some had notable sequence similarities, suggesting that these motifs may be subject to more than one type of modification. This finding highlights the potential importance of these motifs in gene regulation.
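The abstract names integrated gradients as the interpretability method but does not show how attributions are computed. The sketch below illustrates the general technique on a one-hot encoded DNA sequence, attributing a score to each position by averaging gradients along a straight path from an all-zero baseline to the input. It is only an illustration under stated assumptions: `toy_model`, `toy_gradient`, the weight matrix, the example sequence, and the all-zero baseline are all hypothetical stand-ins, not the iResNetDM architecture or the authors' settings.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]

def toy_model(x, weights):
    """Hypothetical stand-in for a trained network: sigmoid of a linear score."""
    score = sum(w * v
                for w_row, x_row in zip(weights, x)
                for w, v in zip(w_row, x_row))
    return 1.0 / (1.0 + math.exp(-score))

def toy_gradient(x, weights):
    """Analytic gradient of toy_model with respect to every input entry."""
    p = toy_model(x, weights)
    return [[p * (1.0 - p) * w for w in w_row] for w_row in weights]

def integrated_gradients(x, baseline, weights, steps=200):
    """Midpoint Riemann-sum approximation of integrated gradients."""
    total = [[0.0] * len(row) for row in x]
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight line from the baseline to the input.
        point = [[b + alpha * (v - b) for v, b in zip(x_row, b_row)]
                 for x_row, b_row in zip(x, baseline)]
        grad = toy_gradient(point, weights)
        for i, g_row in enumerate(grad):
            for j, g in enumerate(g_row):
                total[i][j] += g / steps
    # Scale the averaged gradient by the input-minus-baseline difference.
    return [[(v - b) * t for v, b, t in zip(x_row, b_row, t_row)]
            for x_row, b_row, t_row in zip(x, baseline, total)]

# Illustrative sequence and weights (assumed values, one weight row per position).
seq = "ACGTC"
weights = [[0.5, -0.2, 0.1, 0.0],
           [0.0, 0.8, -0.3, 0.1],
           [-0.4, 0.0, 1.2, 0.2],
           [0.1, 0.1, 0.0, -0.6],
           [0.0, 0.9, 0.0, 0.0]]
x = one_hot(seq)
baseline = [[0.0] * 4 for _ in seq]

attributions = integrated_gradients(x, baseline, weights)
per_position = [sum(row) for row in attributions]  # one score per nucleotide
```

A useful sanity check for any integrated gradients implementation is the completeness property: the attributions should sum to approximately the model output at the input minus the output at the baseline. In a motif analysis, contiguous runs of high per-position scores would be the candidates to inspect.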
biochemistry & molecular biology