Identifying N6-methyladenosine Sites Using Extreme Gradient Boosting System Optimized by Particle Swarm Optimizer.

Xiaowei Zhao, Ye Zhang, Qiao Ning, Hongrui Zhang, Jinchao Ji, Minghao Yin
DOI: https://doi.org/10.1016/j.jtbi.2019.01.035
IF: 2.405
2019-01-01
Journal of Theoretical Biology
Abstract: N6-methyladenosine (m6A) is one of the most important RNA modifications, playing a role in processes ranging from splicing, mRNA export and stability to cell differentiation. Because m6A is widely distributed in genes, identifying m6A sites in RNA sequences is important for basic biomedical research and drug development. High-throughput laboratory methods are time-consuming and costly, so effective computational methods are highly desirable for their convenience and speed. In this article, we propose a new method, extreme gradient boosting optimized by particle swarm optimization (PXGB), which improves m6A prediction by combining deep features with the original features. The proposed PXGB algorithm uses three kinds of features: position-specific nucleotide propensity (PSNP), position-specific dinucleotide propensity (PSDP), and the traditional nucleotide composition (NC). Under 10-fold cross-validation, PXGB achieved an AUC of 0.8390 and an MCC of 0.5234. PXGB was also compared with existing methods, and its higher MCC and AUC demonstrated that it is effective at predicting m6A sites. The predictor proposed in this study might help to predict more m6A sites and guide related experimental validation.
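To make two of the feature types named in the abstract concrete, the following is a minimal sketch of NC and PSNP encoding. It assumes the commonly used definition of PSNP (the per-position nucleotide frequency in positive samples minus that in negative samples), which the abstract itself does not spell out. The function names and the toy sequences are hypothetical, and the sketch does not cover PSDP, the deep features, or the PSO-tuned gradient boosting step.

```python
from collections import Counter

ALPHABET = "ACGU"  # RNA nucleotides, in a fixed order


def nucleotide_composition(seq):
    """NC: fraction of each nucleotide in seq, in ACGU order."""
    counts = Counter(seq)
    return [counts[n] / len(seq) for n in ALPHABET]


def position_frequencies(seqs):
    """Per-position nucleotide frequencies over equal-length sequences."""
    table = []
    for i in range(len(seqs[0])):
        counts = Counter(s[i] for s in seqs)
        table.append({n: counts[n] / len(seqs) for n in ALPHABET})
    return table


def psnp_encode(seq, pos_freq, neg_freq):
    """PSNP (assumed definition): for each position, the frequency of the
    observed nucleotide in positives minus its frequency in negatives."""
    return [pos_freq[i][n] - neg_freq[i][n] for i, n in enumerate(seq)]


# Toy data, invented for illustration only (not from the paper).
positives = ["GGACU", "AGACA", "GAACU"]
negatives = ["CUAGG", "UUACC", "GCAUA"]

pos_freq = position_frequencies(positives)
neg_freq = position_frequencies(negatives)

query = "GGACU"
features = psnp_encode(query, pos_freq, neg_freq) + nucleotide_composition(query)
```

Here `features` concatenates one PSNP value per sequence position with the four NC fractions, giving a fixed-length vector that a classifier such as gradient boosting could consume.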