Prediction of Hot Spots Residues in Protein–protein Interface Using Network Feature and Microenvironment Feature

Ling Ye, Qifan Kuang, Lin Jiang, Jiesi Luo, Yanping Jiang, Zhanling Ding, Yizhou Li, Menglong Li
DOI: https://doi.org/10.1016/j.chemolab.2013.11.010
IF: 4.175
2013-01-01
Chemometrics and Intelligent Laboratory Systems
Abstract: Hot spot residues in protein–protein interfaces play crucial roles in protein binding. In the present study, a complex network method was applied to uncover the influence of neighboring residues on hot spots, and several network and microenvironment features were then designed to describe the diversity of the hot spot environment. After feature analysis by permutation importance in Random Forest (RF), an optimal 58-dimensional feature set including ten network and microenvironment features was selected and used to construct a Support Vector Machine (SVM) prediction model for hot spots. On the independent test set, the model achieved an accuracy (ACC) of 79.0% and a Matthews correlation coefficient (MCC) of 0.470. The novel network and microenvironment features proved promising for discovering hot spots in interfaces. A further microenvironment analysis was also performed. Amino acid residues in direct contact with hot spots in the residue–residue interaction network are significantly important for the microenvironment of hot spots. The amino acids alanine (A), aspartic acid (D), glycine (G), histidine (H), isoleucine (I), asparagine (N), serine (S) and tyrosine (Y) are more likely to occur in the vicinity of hot spots than in the vicinity of non-hot spots. These residues probably cluster together to construct a suitable microenvironment for hot spots.
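The two evaluation metrics quoted in the abstract, ACC and MCC, can both be computed from a binary confusion matrix. The following is a minimal sketch of the standard definitions, not the authors' code; the function name `acc_and_mcc` and the example counts are hypothetical and are not taken from the paper's test set.

```python
import math


def acc_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, Matthews correlation coefficient) for binary counts.

    tp/fn: hot spots predicted as hot spot / non-hot spot.
    tn/fp: non-hot spots predicted as non-hot spot / hot spot.
    """
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    # MCC denominator: product of the four marginal totals, square-rooted.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc


# Hypothetical counts for illustration only (not the paper's data).
acc, mcc = acc_and_mcc(tp=30, tn=50, fp=10, fn=10)
print(f"ACC = {acc:.3f}, MCC = {mcc:.3f}")
```

Unlike ACC, MCC uses all four cells of the confusion matrix, so it is less easily inflated when non-hot spots greatly outnumber hot spots, which is presumably why the two are reported together.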