Hierarchical Identification of MicroRNA Families for Biomedical Applications

Ke Chen, Quan Zou, Zhiping Peng, Wende Ke, Jinglong Zuo
DOI: https://doi.org/10.1166/jctn.2014.3441
2013-01-01
Journal of Computational and Theoretical Nanoscience
Abstract: MicroRNAs (miRNAs) are short, non-coding RNA molecules that are directly involved in the post-transcriptional regulation of gene expression. Genome-wide association studies have demonstrated that several human miRNA genes at genomic regions have been linked to cancer. Likewise, the absolute expression levels of various miRNAs were significantly reduced in tumors. Biomedical research has attempted to design cancer drugs by constructing novel miRNAs. Family information is important in this reconstruction process, but the prediction of families for novel miRNAs is challenging because of the uneven distribution of family members. Previous studies have focused on distinguishing miRNAs from coding or other non-coding sequences, whereas the prediction of family information was disregarded. In this paper, we focused on the accurate forecasting of family information for novel miRNAs based on primary precursor sequences. Our method hierarchically classified miRNAs and assessed whether a novel miRNA belonged to one of the more popular miRNA families. At each layer, we first extracted n-gram features from the known sequences and then trained a random forest model to classify the novel miRNAs. Experiments on the miRBase data sets demonstrated the efficiency and effectiveness of the proposed model. Furthermore, the family-disease relationships were mined from PubMed and reported. These relationships could be used to analyze the function of novel miRNAs and to design biomedical drugs based on RNA interference.
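The n-gram feature extraction step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the value of n, the normalization, or how ambiguous bases are handled, so the function name `ngram_features`, the default `n=3`, the frequency normalization, and the T-to-U conversion below are all assumptions made for illustration and not the authors' implementation.

```python
from collections import Counter
from itertools import product

# Assumed alphabet: the four RNA bases. DNA-style "T" is mapped to "U".
ALPHABET = "ACGU"


def ngram_features(sequence, n=3):
    """Return normalized n-gram frequencies for an RNA sequence.

    The vector has len(ALPHABET) ** n entries in a fixed lexicographic
    order (AAA, AAC, ..., UUU for n=3), so vectors from different
    sequences are directly comparable. Windows containing characters
    outside ACGU (for example "N") are ignored.
    """
    seq = sequence.upper().replace("T", "U")
    keys = ["".join(p) for p in product(ALPHABET, repeat=n)]
    # Count every overlapping window of length n.
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts[k] for k in keys)
    if total == 0:
        return [0.0] * len(keys)
    return [counts[k] / total for k in keys]
```

Feature vectors of this kind could then be passed to any random forest implementation (for instance scikit-learn's `RandomForestClassifier`), with one model per layer of the hierarchy. That training step is not shown here because the abstract gives no details about the layer structure or model settings.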