SSIPe: Accurately Estimating Protein-Protein Binding Affinity Change Upon Mutations Using Evolutionary Profiles in Combination with an Optimized Physical Energy Function

Xiaoqiang Huang, Wei Zheng, Robin Pearce, Yang Zhang
DOI: https://doi.org/10.1093/bioinformatics/btz926
IF: 5.8
2019-01-01
Bioinformatics
Abstract:
Motivation: Most proteins perform their biological functions through interactions with other proteins in cells. Amino acid mutations, especially those occurring at protein interfaces, can change the stability of protein-protein interactions (PPIs) and impact their functions, which may cause various human diseases. Quantitative estimation of the binding affinity changes (ΔΔG_bind) caused by mutations can provide critical information for protein function annotation and genetic disease diagnoses.
Results: We present SSIPe, which combines protein interface profiles, collected from structural and sequence homology searches, with a physics-based energy function for accurate ΔΔG_bind estimation. To offset the statistical limits of the PPI structure and sequence databases, amino acid-specific pseudocounts were introduced to enhance the profile accuracy. SSIPe was evaluated on large-scale experimental data containing 2204 mutations from 177 proteins, where training and test datasets were stringently separated, with the sequence identity between proteins from the two datasets below 30%. The Pearson correlation coefficient between estimated and experimental ΔΔG_bind was 0.61, with a root-mean-square error of 1.93 kcal/mol, which was significantly better than the other methods. Detailed data analyses revealed that the major advantage of SSIPe over other traditional approaches lies in the novel combination of the physical energy function with the new knowledge-based interface profile. SSIPe also considerably outperformed a former profile-based method (BindProfX) due to the newly introduced sequence profiles and the optimized pseudocount technique, which allows for consideration of amino acid-specific prior mutation probabilities.
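The two evaluation metrics quoted in the abstract, the Pearson correlation coefficient and the root-mean-square error between estimated and experimental binding affinity changes, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `pearson_r` and `rmse` are hypothetical, and the numbers in the usage lines are invented toy values, not data from the paper.

```python
import math


def pearson_r(predicted, experimental):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    if n != len(experimental) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    # Covariance and variances (unnormalized; the 1/n factors cancel).
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    if var_p == 0 or var_e == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_e)


def rmse(predicted, experimental):
    """Root-mean-square error, in the same units as the inputs (e.g. kcal/mol)."""
    n = len(predicted)
    if n != len(experimental) or n == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n)


if __name__ == "__main__":
    # Toy values for illustration only (assumed, not taken from the paper).
    toy_predicted = [0.5, 1.2, -0.3, 2.0]
    toy_experimental = [0.8, 1.0, 0.1, 2.9]
    print(f"PCC  = {pearson_r(toy_predicted, toy_experimental):.2f}")
    print(f"RMSE = {rmse(toy_predicted, toy_experimental):.2f} kcal/mol")
```

A higher correlation indicates that the predictions rank mutations in the same order as experiment, while a lower RMSE indicates smaller absolute errors; the two are reported together because a method can do well on one and poorly on the other.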