Protein Subcellular Localization Based on PSI-BLAST and Machine Learning.

Jian Guo, Xian Pu, Yuanlie Lin, Howard Leung
DOI: https://doi.org/10.1142/s0219720006002405
2006-01-01
Journal of Bioinformatics and Computational Biology
Abstract: Subcellular location is an important functional annotation of proteins. An automatic, reliable and efficient prediction system for protein subcellular localization is necessary for large-scale genome analysis. This paper describes a protein subcellular localization method that extracts features from protein profiles rather than from amino acid sequences. A protein profile represents a protein family and discards the part of the sequence information that is not conserved throughout the family; it is therefore more sensitive than the amino acid sequence. The amino acid compositions of the whole profile and of the N-terminus of the profile are extracted to train and test probabilistic neural network classifiers. On two benchmark datasets, the overall accuracies of the proposed method reach 89.1% and 68.9%, respectively. The prediction results show that the proposed method performs better than methods based on amino acid sequences. The prediction results of the proposed method are also compared with Subloc on two redundancy-reduced datasets.
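The pipeline outlined in the abstract (composition features from the whole profile and from its N-terminus, fed to a probabilistic neural network) can be illustrated with a minimal sketch. The abstract does not give the exact feature definition, the N-terminal length, or the classifier settings, so everything below is an assumption for illustration: a profile is taken to be a list of rows of non-negative per-position residue weights (for example, frequencies derived from a PSI-BLAST profile), the composition is taken to be the normalised column sums, and the classifier is a plain Gaussian Parzen-window PNN. The names `profile_composition`, `extract_features`, `pnn_predict`, `n_terminal_len` and `sigma` are hypothetical and are not taken from the paper.

```python
import math


def profile_composition(profile):
    """Sum each residue column of a profile and normalise the sums to 1.

    `profile` is a list of rows, one per sequence position; each row holds
    one non-negative weight per residue type (assumed input format).
    """
    n_cols = len(profile[0])
    totals = [0.0] * n_cols
    for row in profile:
        for j, value in enumerate(row):
            totals[j] += value
    total = sum(totals)
    if total == 0:
        # Degenerate profile: fall back to a uniform composition.
        return [1.0 / n_cols] * n_cols
    return [t / total for t in totals]


def extract_features(profile, n_terminal_len=20):
    """Concatenate whole-profile and N-terminal compositions.

    `n_terminal_len` is an assumed value; the abstract does not state it.
    """
    whole = profile_composition(profile)
    n_term = profile_composition(profile[:n_terminal_len])
    return whole + n_term


def pnn_predict(x, train_features, train_labels, sigma=0.1):
    """Classify `x` with a basic probabilistic neural network.

    Each class score is the mean Gaussian kernel between `x` and that
    class's training vectors; the class with the highest score wins.
    `sigma` is the kernel width (assumed value).
    """
    kernel_sums = {}
    counts = {}
    for feats, label in zip(train_features, train_labels):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, feats))
        kernel = math.exp(-sq_dist / (2.0 * sigma ** 2))
        kernel_sums[label] = kernel_sums.get(label, 0.0) + kernel
        counts[label] = counts.get(label, 0) + 1
    return max(kernel_sums, key=lambda c: kernel_sums[c] / counts[c])
```

In use, one would call `extract_features` on each training profile, keep the resulting vectors with their location labels, and pass a query protein's feature vector to `pnn_predict`. A real profile has 20 residue columns; the functions do not depend on that number, which keeps toy inputs small.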