Fuzzy Neural Tangent Kernel Model for Identifying DNA N4-methylcytosine Sites

Yijie Ding, Prayag Tiwari, Fei Guo, Quan Zou, Weiping Ding
DOI: https://doi.org/10.1109/tfuzz.2024.3425616
2024-01-01
Abstract: Identifying DNA N4-methylcytosine (4mC) sites is an important task in bioinformatics to which machine learning methods have been applied effectively. Because of noise in the data, existing deep learning methods for 4mC detection show consistently low recognition rates on positive samples. Fuzzy systems, built on fuzzy rules and membership functions, can handle noisy signals well, but traditional fuzzy systems lack deep feature representation and a measure of similarity between samples. To address this, we incorporate the neural tangent kernel (NTK) and kernel learning algorithms into the fuzzy system and propose the fuzzy NTK (FNTK) model and the radius-based FNTK (R-FNTK) model for predicting DNA 4mC sites. To achieve better generalization than traditional kernel functions, we first train the NTK for feature representation learning and sample measurement. Based on the membership function and the NTK matrix, a separate fuzzy kernel matrix is then constructed for each fuzzy subset of the fuzzy system. Finally, we use two types of iterative kernel optimization algorithms to fuse the multiple NTK-based fuzzy kernels and obtain the final prediction model. Rigorous testing on six benchmark datasets demonstrates the superiority of our approach, with significant improvements in prediction performance.
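The pipeline described in the abstract (an NTK matrix, one fuzzy kernel per fuzzy subset, then a fusion of those kernels) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the NTK is the closed form for an infinitely wide two-layer ReLU network rather than a trained NTK, memberships are normalized Gaussians around hypothetical centers, each fuzzy kernel is the NTK matrix weighted entrywise by the product of the two samples' memberships, and the fusion weights come from a simple one-shot kernel-target alignment heuristic swapped in for the paper's two iterative optimization algorithms. All function names are hypothetical.

```python
import math

def ntk_two_layer_relu(x, z):
    """Closed-form NTK of an infinitely wide two-layer ReLU net without bias
    (an assumed stand-in for the trained NTK mentioned in the abstract)."""
    nx = math.sqrt(sum(a * a for a in x))
    nz = math.sqrt(sum(b * b for b in z))
    if nx == 0.0 or nz == 0.0:
        return 0.0
    dot = sum(a * b for a, b in zip(x, z))
    u = max(-1.0, min(1.0, dot / (nx * nz)))  # cosine of the angle, clipped
    theta = math.acos(u)
    k0 = (math.pi - theta) / math.pi  # arc-cosine kernel of degree 0
    k1 = (u * (math.pi - theta) + math.sqrt(1.0 - u * u)) / math.pi  # degree 1
    return nx * nz * k1 + dot * k0

def kernel_matrix(X):
    """NTK Gram matrix over all pairs of samples."""
    return [[ntk_two_layer_relu(a, b) for b in X] for a in X]

def gaussian_memberships(X, centers, sigma):
    """Normalized Gaussian membership of each sample in each fuzzy subset.
    sigma should be on the scale of the data so the sums do not underflow."""
    rows = []
    for x in X:
        raw = [math.exp(-sum((a - c) ** 2 for a, c in zip(x, ctr)) / (2 * sigma ** 2))
               for ctr in centers]
        total = sum(raw)
        rows.append([v / total for v in raw])
    return rows

def fuzzy_kernels(K, mu):
    """One kernel per fuzzy subset r: K_r[i][j] = mu[i][r] * mu[j][r] * K[i][j]
    (assumed construction, chosen because it keeps each K_r positive semidefinite)."""
    n, rules = len(K), len(mu[0])
    return [[[mu[i][r] * mu[j][r] * K[i][j] for j in range(n)] for i in range(n)]
            for r in range(rules)]

def alignment_weights(kernels, y):
    """Non-iterative heuristic: weight each kernel by its alignment with y y^T.
    Assumes labels in {-1, +1}, so the Frobenius norm of y y^T equals n."""
    n = len(y)
    scores = []
    for K in kernels:
        num = sum(y[i] * y[j] * K[i][j] for i in range(n) for j in range(n))
        fro = math.sqrt(sum(v * v for row in K for v in row))
        scores.append(max(num / (fro * n), 0.0) if fro > 0 else 0.0)
    total = sum(scores)
    if total == 0:
        return [1.0 / len(kernels)] * len(kernels)
    return [s / total for s in scores]

def fuse(kernels, weights):
    """Convex combination of the fuzzy kernels."""
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
            for i in range(n)]

if __name__ == "__main__":
    # Toy feature vectors and labels; real inputs would be encoded DNA sequences.
    X = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0],
         [1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
    y = [1, -1, 1, -1]
    K = kernel_matrix(X)
    mu = gaussian_memberships(X, centers=X[:2], sigma=1.0)
    Ks = fuzzy_kernels(K, mu)
    w = alignment_weights(Ks, y)
    print("fusion weights:", w)
    print("fused kernel, first row:", fuse(Ks, w)[0])
```

The fused matrix could then be passed to any kernel classifier that accepts a precomputed Gram matrix. The entrywise membership weighting was picked for the sketch because a product of positive semidefinite matrices taken entry by entry stays positive semidefinite, so each fuzzy kernel remains a valid kernel.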