Prediction of nucleic acid-binding proteins using support vector machines
Yuan Youlang, Liu Liang, Niu Bing, Lu Wencong, Cai Yudong
DOI: https://doi.org/10.3969/j.issn.1001-4160.2010.02.005
2010-01-01
Abstract: In this work, we combined support vector machines (SVMs) with protein sequence amino acid composition and associated physicochemical properties to predict nucleic acid-binding proteins. We developed binary classifiers for rRNA-, RNA-, and DNA-binding proteins, which play an important role in the control of many cellular processes. Each SVM model predicts whether a protein belongs to the rRNA-, RNA-, or DNA-binding class. Ten-fold cross-validation was performed on protein data sets in which the sequence identity was approximately 40%. The cross-validation accuracies of the SVM models for rRNA-, RNA-, and DNA-binding proteins were 93.75%, 83.41%, and 81.85%, respectively. Predictions were also made on a test data set. The results agree well with prior knowledge of those proteins and show that sequence-derived physicochemical properties are effective for protein function prediction. Based on this work, an online server for predicting nucleic acid-binding proteins with SVMs is available at http://chemdata.shu.edu.cn/protein_na.
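The amino acid composition features and the ten-fold cross-validation protocol mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `amino_acid_composition` and `k_fold_indices` are hypothetical, only the composition features are shown (the physicochemical descriptors are omitted), and the interleaved fold assignment is an assumption, since the abstract does not say how the folds were drawn.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard residue.

    Non-standard symbols (e.g. 'X') are ignored; this is an assumption,
    not something stated in the abstract.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k-fold cross-validation.

    Samples are assigned to folds in an interleaved fashion (hypothetical
    choice); each sample appears in exactly one test fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


# Example: featurize a short toy sequence and build the ten splits.
features = amino_acid_composition("MKRLAAGG")
splits = list(k_fold_indices(25, k=10))
```

In a setup like the one described, each 20-dimensional vector (extended with physicochemical descriptors) would be the input to one binary SVM per class, and the accuracy would be averaged over the ten held-out folds.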