A Novel Method for Protein Subcellular Localization: Combining Residue-Couple Model and SVM.

Jian Guo, Yuanlie Lin, Zhirong Sun
DOI: https://doi.org/10.1142/9781860947322_0012
2005-01-01
Abstract: Subcellular localization plays an important role in genome analysis as a key functional characteristic of proteins. An automatic, reliable and efficient prediction system for protein subcellular localization is therefore needed for large-scale genome analysis. This paper describes a new residue-couple model that uses a support vector machine to predict the subcellular localization of proteins. The new approach provides better predictions than existing methods. With 5-fold cross-validation, the total prediction accuracy on Reinhardt and Hubbard’s dataset reaches 92.0% for prokaryotic protein sequences and 86.9% for eukaryotic protein sequences. For a new dataset of 8304 proteins located in 8 subcellular locations, the total accuracy reaches 88.9%. The model is robust to N-terminal errors in the sequences. A web server based on the method was developed and used to predict some new proteins.
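The abstract does not spell out how the residue-couple features are computed. As a rough illustration only, the sketch below assumes that a residue-couple encoding counts ordered amino-acid pairs separated by a fixed gap along the sequence and normalizes each count by the number of positions available at that gap. The function name `residue_couple_features`, the `max_gap` parameter and the normalization are hypothetical choices, not details confirmed by the paper. The resulting fixed-length vector is the kind of input a support vector machine could be trained on; the classifier and cross-validation steps are left out here.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so feature indices are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_couple_features(sequence, max_gap=3):
    """Encode a protein sequence as gapped residue-pair frequencies.

    Assumption (not taken from the abstract): for each gap g in 1..max_gap,
    count every ordered pair (sequence[i], sequence[i + g]) and divide by the
    number of positions available at that gap. The result has
    400 * max_gap entries.
    """
    seq = sequence.upper()
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    index = {pair: i for i, pair in enumerate(pairs)}

    features = []
    for gap in range(1, max_gap + 1):
        counts = [0.0] * len(pairs)
        n_positions = len(seq) - gap
        for i in range(max(n_positions, 0)):
            pair = seq[i] + seq[i + gap]
            # Pairs containing non-standard residues (e.g. X) are skipped.
            if pair in index:
                counts[index[pair]] += 1.0
        if n_positions > 0:
            counts = [c / n_positions for c in counts]
        features.extend(counts)
    return features


# Toy usage: a 4-residue sequence encoded with gaps 1 and 2.
toy_vector = residue_couple_features("ACAC", max_gap=2)
```

Because every sequence maps to a vector of the same length regardless of protein length, an encoding of this kind can be passed directly to a standard SVM implementation and evaluated with k-fold cross-validation.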