Prediction of Protein Amidation Sites by Feature Selection and Analysis

Weiren Cui, Shen Niu, Lulu Zheng, Lele Hu, Tao Huang, Lei Gu, Kaiyan Feng, Ning Zhang, Yudong Cai, Yixue Li
DOI: https://doi.org/10.1007/s00438-013-0760-x
IF: 2.98
2013-01-01
Molecular Genetics and Genomics
Abstract: Carboxy-terminal α-amidation is a post-translational modification of proteins that is widespread in vertebrates and invertebrates. The α-amide group is required for full biological activity, since it prevents ionization of the C-terminus and may thereby render a peptide more hydrophobic and better able to bind to other proteins. However, C-terminal amidation is very difficult to detect because experimental methods are often labor-intensive, time-consuming, and expensive. In silico methods may therefore complement experimental approaches owing to their high efficiency. In this study, a computational method was developed to predict protein amidation sites by combining the maximum relevance minimum redundancy (mRMR) method with the incremental feature selection (IFS) method, based on the nearest neighbor algorithm. From a total of 735 features, 41 optimal features were selected and used to construct the final predictor, which achieved an overall Matthews correlation coefficient (MCC) of 0.8308. Feature analysis showed that PSSM conservation scores and amino acid factors played the most important roles in α-amidation site prediction. Site-specific feature analysis showed that features derived from the amidation site itself and from adjacent sites were the most significant. The method presented here could be used as an efficient tool to predict amidated peptides theoretically, and the selected features could shed light on the mechanisms of the amidation modification and provide guidelines for experimental validation.
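The selection procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the squared Euclidean distance, the leave-one-out evaluation, and the toy data are all assumptions made for illustration, and the mRMR ranking is taken as already computed. Only the general idea comes from the abstract: add features one at a time in ranked order, score a nearest neighbor classifier with the Matthews correlation coefficient, and keep the best-scoring feature subset.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def nn_predict(train_X, train_y, x, feats):
    """1-nearest-neighbor label using only the feature indices in `feats`.

    Squared Euclidean distance is an assumption of this sketch.
    """
    nearest = min(
        range(len(train_X)),
        key=lambda i: sum((train_X[i][f] - x[f]) ** 2 for f in feats),
    )
    return train_y[nearest]


def loo_mcc(X, y, feats):
    """Leave-one-out MCC of the 1-NN classifier (evaluation scheme assumed)."""
    tp = tn = fp = fn = 0
    for i in range(len(X)):
        pred = nn_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], feats)
        if pred == 1 and y[i] == 1:
            tp += 1
        elif pred == 0 and y[i] == 0:
            tn += 1
        elif pred == 1:
            fp += 1
        else:
            fn += 1
    return mcc(tp, tn, fp, fn)


def incremental_feature_selection(X, y, ranked):
    """Evaluate the top-k ranked features for k = 1..N and keep the best k.

    `ranked` is a list of feature indices, most important first
    (for example, the output of an mRMR ranking computed elsewhere).
    Ties are resolved in favor of the smaller subset.
    """
    best_k, best_score = 0, -2.0  # MCC lies in [-1, 1]
    for k in range(1, len(ranked) + 1):
        score = loo_mcc(X, y, ranked[:k])
        if score > best_score:
            best_k, best_score = k, score
    return ranked[:best_k], best_score


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
toy_X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0],
         [1.0, 5.1], [1.1, 1.1], [1.2, 9.1]]
toy_y = [0, 0, 0, 1, 1, 1]
selected, score = incremental_feature_selection(toy_X, toy_y, ranked=[0, 1])
print(selected, score)
```

In the study itself, the same kind of loop would run over the 735 ranked features, and the abstract reports that the best subset contained 41 of them.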