Computational Identification of N6-methyladenosine Sites in Multiple Tissues of Mammals.

Fu-Ying Dao, Hao Lv, Yu-He Yang, Hasan Zulfiqar, Hui Gao, Hao Lin
DOI: https://doi.org/10.1016/j.csbj.2020.04.015
IF: 6.155
2020-01-01
Computational and Structural Biotechnology Journal
Abstract: N6-methyladenosine (m6A) is the methylation of adenosine at the nitrogen-6 position. It is the most abundant RNA methylation modification and is involved in a series of important biological processes. Accurate genome-wide identification of m6A sites is invaluable for better understanding their biological functions. In this work, an ensemble predictor named iRNA-m6A was established to identify m6A sites in multiple tissues of human, mouse and rat, based on data from high-throughput sequencing techniques. In the proposed predictor, RNA sequences were encoded by a physical-chemical property matrix, mono-nucleotide binary encoding and nucleotide chemical property. Subsequently, these features were optimized using the minimum Redundancy Maximum Relevance (mRMR) feature selection method. Based on the optimal feature subset, the best m6A classification models were trained with a Support Vector Machine (SVM) under 5-fold cross-validation. Prediction results on an independent dataset showed that the proposed method has excellent generalization ability. We also established a user-friendly webserver called iRNA-m6A, which is freely accessible at http://lin-group.cn/server/iRNA-m6A. This tool will make it more convenient for users to study m6A modification in different tissues.
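As a rough illustration of two of the encodings named in the abstract, the sketch below turns an RNA sequence into a flat feature vector using mono-nucleotide binary (one-hot) encoding and a nucleotide chemical property (NCP) encoding. The abstract does not give the actual mappings, so the tables here are assumptions: the one-hot order A, C, G, U is arbitrary, and the NCP triples follow a commonly used convention (ring structure, functional group, hydrogen-bond strength). The function names `encode` and `encode_sequence` are hypothetical, and the physical-chemical property matrix, mRMR selection and SVM training are not covered.

```python
# Assumed lookup tables; the paper's exact definitions are not given in the abstract.
# One-hot order A, C, G, U is an arbitrary choice for this sketch.
BINARY = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "U": (0, 0, 0, 1),
}

# Assumed NCP triples: (purine ring, amino group, weak hydrogen bonding).
NCP = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "U": (0, 0, 1),
}


def encode(seq, table):
    """Concatenate the per-nucleotide vectors from `table` along the sequence.

    Input is upper-cased and T is treated as U; any other unknown
    symbol (for example N) maps to an all-zero vector.
    """
    seq = seq.upper().replace("T", "U")
    width = len(next(iter(table.values())))
    features = []
    for base in seq:
        features.extend(table.get(base, (0,) * width))
    return features


def encode_sequence(seq):
    """Binary encoding followed by NCP encoding: 7 values per nucleotide."""
    return encode(seq, BINARY) + encode(seq, NCP)


if __name__ == "__main__":
    # A short made-up fragment, used only to show the output shape.
    vector = encode_sequence("GGACU")
    print(len(vector))
```

A vector built this way could then be passed to a feature selection step and a classifier; for a window of L nucleotides it has 7L entries before selection.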