Prediction of Protein-Protein Interaction Using Distance Frequency of Amino Acids Grouped with Their Physicochemical Properties
Shao-Wu Zhang, Yongmei Cheng, Li Luo, Quan Pan
DOI: https://doi.org/10.1109/bic-ta.2011.53
2011-01-01
Abstract: Protein-protein interactions (PPIs) play a key role in many cellular processes. These interactions form the basis of phenomena such as DNA replication and transcription, metabolic pathways, signaling pathways, and cell cycle control. Knowing how proteins interact with each other can help biologists understand the molecular mechanisms of the cell. Unfortunately, experimental methods for identifying PPIs are both time-consuming and expensive. Therefore, computational approaches for predicting PPIs would be of significant value. Here, we propose a novel method for predicting PPIs using the distance frequency of amino acids grouped by their physicochemical properties (hydrophobicity, normalized van der Waals volume, polarity and polarizability) together with principal component analysis (PCA). First, the 20 standard amino acids were divided into three groups according to the values of each of the four physicochemical properties. Second, the distance frequency feature extraction method was introduced to represent the protein pairs, and the feature vectors extracted with the four physicochemical properties were fused to form different feature vector sets. Third, PCA was used to reduce the vector dimension, and a support vector machine was adopted as the classifier. The overall success rates of our method for hydrophobicity, normalized van der Waals volume, polarity and polarizability are 89.88%, 89.72%, 89.28% and 89.24% in the 10-fold cross-validation (10CV) test, which are 6.65%, 8.05%, 9.72% and 8.09% higher, respectively, than those of Guo's auto-covariance function feature extraction method. The total prediction accuracy obtained by fusing the four physicochemical properties reaches 91.79%. The results show that the current approach is very promising for predicting PPIs and may become a useful tool in the relevant areas.
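The first two steps described in the abstract (grouping residues, then building a distance frequency vector for a protein pair) can be illustrated with a minimal sketch. The abstract does not give the group memberships or the exact definition of distance frequency, so everything below is an assumption: the three-class hydrophobicity grouping is one commonly used in the literature, "distance" is taken to mean the gap between consecutive residues of the same group, and the names `HYDROPHOBICITY_GROUPS`, `MAX_DISTANCE`, `distance_frequency` and `pair_features` are hypothetical. The PCA and support vector machine steps are omitted.

```python
# ASSUMPTION: a commonly used three-class hydrophobicity grouping;
# the abstract does not list the groups the authors actually used.
HYDROPHOBICITY_GROUPS = {
    "polar": "RKEDQN",
    "neutral": "GASTPHY",
    "hydrophobic": "CLVIMFW",
}

# ASSUMPTION: hypothetical cutoff; longer gaps are pooled into the last bin.
MAX_DISTANCE = 5


def encode_groups(sequence, groups):
    """Map each residue to its group index (0, 1 or 2).

    Residues outside the 20 standard amino acids are dropped
    before any distances are measured.
    """
    lookup = {
        aa: idx
        for idx, members in enumerate(groups.values())
        for aa in members
    }
    return [lookup[aa] for aa in sequence.upper() if aa in lookup]


def distance_frequency(sequence, groups=HYDROPHOBICITY_GROUPS,
                       max_distance=MAX_DISTANCE):
    """Return, per group, the normalized histogram of gaps between
    consecutive residues of that group (an assumed definition)."""
    labels = encode_groups(sequence, groups)
    counts = [[0] * max_distance for _ in groups]
    last_seen = [None] * len(groups)

    for pos, group in enumerate(labels):
        if last_seen[group] is not None:
            gap = min(pos - last_seen[group], max_distance)
            counts[group][gap - 1] += 1
        last_seen[group] = pos

    features = []
    for row in counts:
        total = sum(row)
        features.extend(c / total if total else 0.0 for c in row)
    return features


def pair_features(seq_a, seq_b):
    """Represent a protein pair by concatenating both feature vectors."""
    return distance_frequency(seq_a) + distance_frequency(seq_b)
```

Under these assumptions each protein yields 3 x 5 = 15 values per property and each pair yields 30. Repeating the extraction with groupings for the other three properties and concatenating the results would correspond to the fusion step, after which the vectors could be passed to any PCA and support vector machine implementation.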