Iterative feature representation algorithm to improve the predictive performance of N7-methylguanosine sites
Chichi Dai, Pengmian Feng, Lizhen Cui, Ran Su, Wei Chen, Leyi Wei
DOI: https://doi.org/10.1093/bib/bbaa278
IF: 9.5
2020-11-10
Briefings in Bioinformatics
Abstract: Motivation: N7-methylguanosine (m7G) is an important epigenetic modification that plays an essential role in the regulation of gene expression. Accurate identification of m7G modifications will therefore help to reveal and understand in depth their potential functional mechanisms. Although high-throughput experimental methods can precisely locate m7G sites, they are still not cost-effective. It is therefore necessary to develop new methods to identify m7G sites. Results: In this work, we used an iterative feature representation algorithm to develop a machine-learning-based method, m7G-IFL, for identifying m7G sites. To demonstrate its superiority, we evaluated m7G-IFL and compared it with existing predictors. The results show that our predictor outperforms existing predictors in accuracy for identifying m7G sites. By analyzing and comparing the features used in the predictors, we found that positive and negative samples are more clearly separated in our feature space than in the existing feature spaces. This result demonstrates that our features capture more discriminative information through the iterative feature learning process, which contributes to the improved predictive performance.
biochemical research methods, mathematical & computational biology