Predicting Protein Subchloroplast Locations with Both Single and Multiple Sites Via Three Different Modes of Chou's Pseudo Amino Acid Compositions.

Chao Huang, Jing-Qi Yuan
DOI: https://doi.org/10.1016/j.jtbi.2013.06.034
IF: 2.405
2013-01-01
Journal of Theoretical Biology
Abstract: Because location information can indicate important protein functions, developing computational tools to predict protein subcellular localization is an efficient and meaningful task. Existing methods for predicting protein subchloroplast locations can only handle proteins with a single location site. It is therefore both meaningful and challenging to handle proteins with multiple subchloroplast location sites rather than simply excluding them. To address this problem, new systems for predicting protein subchloroplast localization with single or multiple sites are developed and discussed in the paper. Three different versions of the KNN algorithm and four different types of feature extraction were adopted to construct the prediction systems. This is the first effort to predict proteins with multiple subchloroplast locations. The overall jackknife success rates achieved by the best combination (features + classifier) on three datasets with different levels of homology were 89.08%, 81.29% and 71.11%. The performance of the prediction models indicates that the proposed methods might serve as a useful and efficient auxiliary tool for predicting sub-subcellular localization.
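To make the pipeline in the abstract concrete, the following is a minimal sketch of a multi-label KNN predictor evaluated with a jackknife (leave-one-out) test. It is not the authors' implementation: the abstract does not specify the three KNN variants or the four feature types, so plain amino acid composition stands in for the features, and the voting rule, the `threshold` parameter, the function names, and the location labels in the usage note are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(seq):
    """Return a 20-dimensional vector of amino acid frequencies.

    Assumption: plain composition is used here as a stand-in feature;
    the paper's actual feature extraction is not given in the abstract.
    """
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values()) or 1  # avoid division by zero
    return [counts[a] / total for a in AMINO_ACIDS]


def knn_multilabel(query, train, k=3, threshold=0.5):
    """Predict a set of locations for `query`.

    `train` is a list of (feature_vector, set_of_labels) pairs.
    A label is predicted when at least `threshold` of the k nearest
    neighbours carry it (a hypothetical voting rule, not the paper's).
    """
    neighbours = sorted(train, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, labels in neighbours for label in labels)
    predicted = {
        label for label, n in votes.items() if n / len(neighbours) >= threshold
    }
    if not predicted and votes:
        # Fall back to the single most frequent label among the neighbours.
        predicted = {votes.most_common(1)[0][0]}
    return predicted


def jackknife_exact_match(dataset, k=3, threshold=0.5):
    """Leave each sample out in turn and predict it from the rest.

    Returns the fraction of samples whose predicted label set equals
    the true label set exactly.
    """
    hits = 0
    for i, (vec, labels) in enumerate(dataset):
        rest = dataset[:i] + dataset[i + 1:]
        if knn_multilabel(vec, rest, k, threshold) == labels:
            hits += 1
    return hits / len(dataset)
```

In use, each protein sequence would be converted with `aa_composition` and paired with its set of annotated locations (for example `{"thylakoid", "stroma"}` for a two-site protein), and `jackknife_exact_match` would then be called on the resulting list. Exact-match scoring is only one possible way to count a multi-label prediction as a success; the abstract does not say which definition underlies the reported rates.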