Weighted Fuzzy System for Identifying DNA N4-Methylcytosine Sites With Kernel Entropy Component Analysis

Leyao Wang, Prayag Tiwari, Yijie Ding, Fei Guo
DOI: https://doi.org/10.1109/tai.2023.3266191
2023-01-01
IEEE Transactions on Artificial Intelligence
Abstract: N4-methylcytosine (4mC) is a common DNA methylation that has been implicated in epigenetic regulation and host defense. Accurate prediction of 4mC sites in DNA sequences helps to better explore the underlying biological processes and mechanisms. For this problem, computational methods based on machine learning (ML) and deep learning (DL) are faster, less complex, and less expensive than experimental detection methods. However, the prediction accuracy of existing computational methods is still unsatisfactory. In this work, we propose a weighted fuzzy system that identifies DNA 4mC sites with the help of kernel entropy component analysis (KECA). We name it W-TSK-FS-KECA. The model is an improved version of the Takagi-Sugeno-Kang fuzzy system (TSK-FS). We use position-specific trinucleotide propensity (PSTNP) to construct feature vectors on representative benchmark datasets. We then use KECA to obtain the reconstruction error of each sample. Finally, we add the calculated reconstruction errors to the regularization term of TSK-FS as weights to enhance model performance. Comparative experiments with other methods show that W-TSK-FS-KECA achieves good classification performance.
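The first two steps of the pipeline described in the abstract (PSTNP encoding, then a per-sample KECA reconstruction error) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes equal-length sequences over the alphabet A, C, G, T, an RBF kernel with a hypothetical `gamma` parameter, and the common formulation of KECA in which kernel eigen-components are ranked by their contribution to the Renyi entropy estimate. All function names are illustrative. How the errors enter the TSK-FS regularization term is not specified in the abstract, so that step is left out.

```python
import numpy as np
from itertools import product

# All 64 trinucleotides over the DNA alphabet, mapped to row indices.
TRIMERS = ["".join(p) for p in product("ACGT", repeat=3)]
TRI_INDEX = {t: i for i, t in enumerate(TRIMERS)}


def trimer_freq(seqs):
    """Frequency of each trinucleotide at each position, shape (64, L - 2)."""
    n_pos = len(seqs[0]) - 2
    freq = np.zeros((64, n_pos))
    for s in seqs:
        for j in range(n_pos):
            freq[TRI_INDEX[s[j:j + 3]], j] += 1
    return freq / len(seqs)


def pstnp_encode(seqs, pos_seqs, neg_seqs):
    """Encode each sequence as the positive-minus-negative frequency
    of the trinucleotide observed at every position."""
    z = trimer_freq(pos_seqs) - trimer_freq(neg_seqs)
    n_pos = z.shape[1]
    return np.array(
        [[z[TRI_INDEX[s[j:j + 3]], j] for j in range(n_pos)] for s in seqs]
    )


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix between all rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def keca_reconstruction_error(X, n_components, gamma=1.0):
    """Feature-space reconstruction error of each training sample after
    keeping the n_components eigen-components with the largest
    Renyi entropy contribution (assumed KECA formulation)."""
    K = rbf_kernel(X, gamma)
    eigvals, eigvecs = np.linalg.eigh(K)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against round-off
    # Entropy contribution of component i: lambda_i * (e_i^T 1)^2
    entropy = eigvals * eigvecs.sum(axis=0) ** 2
    keep = np.argsort(entropy)[::-1][:n_components]
    # Squared norm of each sample's projection onto the kept components.
    proj_sq = (eigvecs[:, keep] ** 2) @ eigvals[keep]
    return np.diag(K) - proj_sq


# Toy data, invented purely for illustration.
pos = ["ACGTAC", "ACGTTC"]
neg = ["TTGCAA", "TGGCAA"]
X = pstnp_encode(pos + neg, pos, neg)
errors = keca_reconstruction_error(X, n_components=2)
print(X.shape, errors.shape)
```

Samples with a large reconstruction error are poorly represented by the retained components, which is why such a quantity is a plausible candidate for per-sample weighting.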