Fast Fourier Transform-Based Support Vector Machine for Subcellular Localization Prediction Using Different Substitution Models

Zhimeng Wang, Lin Jiang, Menglong Li, Lina Sun, Rongying Lin
DOI: https://doi.org/10.1111/j.1745-7270.2007.00326.x
IF: 3.7
2007-01-01
Acta Biochimica et Biophysica Sinica
Abstract: There are approximately 10^9 proteins in a cell. A hot topic in bioinformatics is how to identify a protein's subcellular localization when its sequence is known. In this paper, a method using a fast Fourier transform-based support vector machine is developed to predict the subcellular localization of proteins from their physicochemical properties and structural parameters. With the c-p-v matrix as the substitution model (c, composition; p, polarity; v, molecular volume), prediction accuracy reached 83% for prokaryotic organisms and 84% for eukaryotic organisms. The overall prediction accuracy was also evaluated using the "leave-one-out" jackknife procedure. The influence of the substitution model on prediction accuracy is also discussed. The source code of the new program is available on request from the authors.
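The abstract does not give implementation details, so the following is only a minimal sketch of how a protein sequence might be turned into FFT-based features for a support vector machine. It is not the authors' program. Several things here are assumptions: the Kyte-Doolittle hydropathy index stands in for the c-p-v properties, sequences are truncated or zero-padded to a fixed power-of-two length (64), and the function names `encode`, `fft`, and `spectrum_features` are hypothetical.

```python
import cmath

# Kyte-Doolittle hydropathy index. ASSUMPTION: used only as a stand-in
# property scale; the paper's c-p-v values are not reproduced here.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence, scale=HYDROPATHY):
    """Map each residue to its property value; unknown residues map to 0.0."""
    return [scale.get(residue, 0.0) for residue in sequence.upper()]


def fft(values):
    """Recursive radix-2 Cooley-Tukey FFT; the input length must be a power of two."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("input length must be a power of two")
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def spectrum_features(sequence, length=64):
    """Return the one-sided power spectrum of the encoded sequence.

    ASSUMPTION: the sequence is truncated or zero-padded to `length`
    so that every protein yields a feature vector of the same size.
    """
    signal = encode(sequence)[:length]
    signal += [0.0] * (length - len(signal))
    coefficients = fft(signal)
    return [abs(c) ** 2 / length for c in coefficients[: length // 2 + 1]]
```

Each call to `spectrum_features` yields a vector of `length // 2 + 1` values. Vectors of this kind could then be passed to any SVM implementation (for example scikit-learn's `SVC`) and scored with a leave-one-out loop; the kernel and parameters the authors used are not stated in the abstract.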