Predicting Human MicroRNA Precursors Based on an Optimized Feature Subset Generated by GA-SVM.
Yanqiu Wang, Xiaowen Chen, Wei Jiang, Li, Wei Li, Lei Yang, Mingzhi Liao, Baofeng Lian, Yingli Lv, Shiyuan Wang, Shuyuan Wang, Xia Li
DOI: https://doi.org/10.1016/j.ygeno.2011.04.011
IF: 4.31
2011-01-01
Genomics
Abstract: MicroRNAs (miRNAs) are non-coding RNAs that play important roles in post-transcriptional regulation. Identifying miRNAs is crucial to understanding their biological mechanisms. Recently, machine-learning approaches have been employed to predict miRNA precursors (pre-miRNAs). However, the features used differ from method to method, and classification performance varies accordingly. Feature selection is therefore critical for pre-miRNA prediction. Using a hybrid of a genetic algorithm and a support vector machine (GA–SVM), we generated an optimized subset of 13 features. In five-fold cross-validation with an SVM, the classification performance of this optimized subset was much higher than that of the two feature sets used in microPred and miPred. Finally, we constructed the classifier miR-SF and used it to predict the most recently identified human pre-miRNAs in miRBase (version 16). Compared with microPred and miPred, miR-SF achieved much higher classification performance: accuracies were 93.97%, 86.21% and 64.66% for miR-SF, microPred and miPred, respectively. Thus, miR-SF is effective for identifying pre-miRNAs.
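To illustrate the kind of wrapper feature selection the abstract describes, the following is a minimal sketch of a genetic algorithm searching over binary feature masks. It is not the authors' implementation. The function names, population size, crossover and mutation rates, and the tournament/elitism scheme are all assumptions made for illustration. In a GA-SVM setting the fitness of a mask would presumably be the cross-validated accuracy of an SVM trained on the selected features; here a toy fitness function with a hypothetical set of "informative" feature indices stands in for that step so the sketch runs with the standard library alone.

```python
import random


def ga_select(n_features, fitness, pop_size=20, generations=30,
              crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask).

    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    # Each individual is a tuple of 0/1 flags, one per candidate feature.
    pop = [tuple(rng.randint(0, 1) for _ in range(n_features))
           for _ in range(pop_size)]
    best = max(pop, key=fitness)

    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in pop]

        def tournament():
            # Pick two distinct individuals at random and keep the fitter one.
            a, b = rng.sample(scored, 2)
            return a[1] if a[0] >= b[0] else b[1]

        next_pop = [best]  # elitism: the best mask so far always survives
        while len(next_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                # Single-point crossover between the two parents.
                cut = rng.randint(1, n_features - 1)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1
            # Bit-flip mutation, applied independently to each position.
            child = tuple(bit ^ 1 if rng.random() < mutation_rate else bit
                          for bit in child)
            next_pop.append(child)
        pop = next_pop
        best = max(pop, key=fitness)
    return best


# Hypothetical stand-in for "cross-validated SVM accuracy on the masked features".
INFORMATIVE = {1, 4, 7}  # assumed indices, for illustration only


def toy_fitness(mask):
    # Reward selecting informative features, lightly penalize subset size.
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)


best_mask = ga_select(10, toy_fitness)
print("selected feature indices:", [i for i, bit in enumerate(best_mask) if bit])
```

To move from this sketch toward a GA-SVM, `toy_fitness` would be replaced by a function that trains and cross-validates an SVM on the columns selected by the mask. The size penalty is one simple way to favor compact subsets and is likewise an assumption.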