Functional Classification of Secreted Proteins by Position Specific Scoring Matrix and Auto Covariance
Jiesi Luo, Lezheng Yu, Yanzhi Guo, Menglong Li
DOI: https://doi.org/10.1016/j.chemolab.2011.11.008
IF: 4.175
2012-01-01
Chemometrics and Intelligent Laboratory Systems
Abstract: Secreted proteins play pivotal regulatory roles in eukaryotic cells, carry out a range of different functions, and have potential as disease biomarkers and protein therapeutics. Understanding their functions is indispensable for genome annotation and also supplements existing methods. With the rapid increase in protein sequences generated in the post-genomic age, there is an urgent need for a computational method that can effectively annotate the function types of the many newly discovered secreted proteins. To this end, this paper proposes a support vector machine (SVM)-based predictor that classifies secreted proteins into five functional types: cytokine, hormone, immune system protein, protease, and protease inhibitor. Proteins are represented by the position specific scoring matrix (PSSM) combined with auto covariance (AC), which together incorporate the evolutionary and sequence-order information of proteins. When distinguishing the five types of secreted proteins, accuracies of 83.2%, 88.9%, 86.1%, 90.9%, and 90.6% are achieved for cytokine, hormone, immune system protein, protease, and protease inhibitor, respectively. In particular, on an independent test set of 325 proteins, the method also yielded a satisfactory accuracy of 91.2%. These results show that the method can serve as a complementary tool for identifying the different functions of secreted proteins. The code and all datasets used in this article are freely available at http://cic.scu.edu.cn/bioinformatics/fcSecretP.zip.
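The abstract does not spell out how AC is applied to a PSSM, so the following is only a minimal sketch under stated assumptions. It assumes the commonly used auto covariance definition, in which each PSSM column j is centered on its mean over the sequence and, for each lag, the products of centered scores at positions i and i + lag are averaged over the L - lag available pairs. The function name `pssm_auto_covariance` and the parameter `max_lag` are hypothetical and are not taken from the authors' released code.

```python
def pssm_auto_covariance(pssm, max_lag):
    """Turn an L x C PSSM (list of rows) into a fixed-length AC feature vector.

    Assumed definition (not confirmed by the abstract):
        AC(lag, j) = 1 / (L - lag) * sum_i (P[i][j] - mean_j) * (P[i + lag][j] - mean_j)
    The output has max_lag * C values, ordered by lag first, then by column.
    """
    n_rows = len(pssm)
    if n_rows <= max_lag:
        raise ValueError("sequence length must exceed max_lag")
    n_cols = len(pssm[0])

    # Mean score of each column over the whole sequence.
    means = [sum(row[j] for row in pssm) / n_rows for j in range(n_cols)]

    features = []
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            # Sum of products of centered scores that are `lag` positions apart.
            total = sum(
                (pssm[i][j] - means[j]) * (pssm[i + lag][j] - means[j])
                for i in range(n_rows - lag)
            )
            features.append(total / (n_rows - lag))
    return features


# Toy 4 x 2 matrix standing in for a real L x 20 PSSM (values are illustrative only).
toy_pssm = [[1, 0], [2, 0], [3, 0], [4, 0]]
toy_features = pssm_auto_covariance(toy_pssm, max_lag=2)
```

Under this assumption, a real PSSM with 20 columns yields a vector of 20 × `max_lag` values regardless of sequence length, which is what lets proteins of different lengths be fed to a fixed-input classifier such as an SVM.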