RED-ML: a Novel, Effective RNA Editing Detection Method Based on Machine Learning
Heng Xiong, Dongbing Liu, Qiye Li, Mengyue Lei, Liqin Xu, Liang Wu, Zongji Wang, Shancheng Ren, Wangsheng Li, Min Xia, Lihua Lu, Haorong Lu, Yong Hou, Shida Zhu, Xin Liu, Yinghao Sun, Jian Wang, Huanming Yang, Kui Wu, Xun Xu, Leo J Lee
DOI: https://doi.org/10.1093/gigascience/gix012
IF: 7.658
2017-01-01
GigaScience
Abstract: With the advancement of second-generation sequencing techniques, our ability to detect and quantify RNA editing on a global scale has vastly improved. As a result, RNA editing is now being studied under a growing number of biological conditions so that its biochemical mechanisms and functional roles can be better understood. However, a major barrier that prevents RNA editing detection from becoming a routine RNA-seq analysis, like gene expression and splicing analysis, is the lack of user-friendly and effective computational tools. Based on years of experience analyzing RNA editing in diverse RNA-seq datasets, we have developed a software tool, RED-ML: RNA Editing Detection based on Machine Learning (pronounced “red ML”). The input to RED-ML can be as simple as a single BAM file, and it can also take advantage of matched genomic variant information when available. The output contains not only the detected RNA editing sites but also a confidence score to facilitate downstream filtering. We have carefully designed validation experiments and performed extensive comparisons and analyses to show the efficiency and effectiveness of RED-ML under different conditions; it can accurately detect novel RNA editing sites without relying on curated RNA editing databases. We have also made this tool freely available via GitHub. We have developed a highly accurate, fast, and general-purpose tool for RNA editing detection using RNA-seq data. With the availability of RED-ML, it is now possible to conveniently make RNA editing detection a routine analysis of RNA-seq data. We believe this can greatly benefit the RNA editing research community and accelerate our understanding of this intriguing post-transcriptional modification process.