Compact assessment of molecular surface complementarities enhances neural network-aided prediction of key binding residues

Greta Grassmann, Lorenzo Di Rienzo, Giancarlo Ruocco, Mattia Miotto, Edoardo Milanetti
2024-07-31
Abstract: Predicting interactions between biomolecules, such as protein-protein complexes, remains a challenging problem. Despite the many advancements made so far, the performance of docking protocols depends heavily on their ability to identify binding regions. In this context, we present a novel approach that builds upon our previous work modeling protein surface patches via sets of orthogonal polynomials to identify regions of high shape/electrostatic complementarity. By incorporating another key binding property, the balance between hydrophilic and hydrophobic contributions, we define new binding matrices that serve as effective inputs for training a neural network. Our approach also allows for the quantitative definition of a typical binding site area (approximately 10 Å in radius) where hydrophobic contribution and shape complementarity, which reflects the Lennard-Jones interaction, are maximized. Using this new architecture, CIRNet (Core Interacting Residues Network), we achieve an accuracy of approximately 0.82 in identifying pairs of core interacting residues on a balanced dataset. In a blind search for core interacting residues, CIRNet distinguishes these from decoys with a ROC AUC of 0.72. This protocol can enhance docking algorithms by rescaling the proposed poses. When applied to the top ten models from three popular docking servers, CIRNet improves docking outcomes, reducing the average RMSD between the refined poses and the native state by up to 58%.
Biomolecules
What problem does this paper attempt to address?
The problem this paper addresses is the prediction of interactions between biomolecules, specifically the identification of binding sites in protein-protein complexes. Despite significant progress in this field, existing docking protocols still struggle to identify binding regions. The paper proposes a new approach that improves a neural network's ability to predict key binding residues by combining three kinds of complementarity: shape, charge distribution, and hydrophobic/hydrophilic properties. Specifically:

1. **Shape Complementarity**: Utilizing orthogonal polynomials (such as Zernike polynomials) to describe the shape of protein surface regions and assess their complementarity.
2. **Charge Distribution Complementarity**: Calculating the electrostatic potential of the molecular surface using the Poisson-Boltzmann equation and assessing its complementarity with the Zernike method.
3. **Hydrophobic/Hydrophilic Properties**: Defining a new hydrophobicity index to evaluate the hydrophobic/hydrophilic balance of the binding interface.

These features are integrated into a new binding matrix, which serves as the input for the neural network, training it to recognize core binding residues. Through this method, the authors aim to improve the accuracy of docking algorithms, particularly in identifying core binding residues and optimizing docking models.

The main contributions of the paper include:

- Proposing a new method to predict protein binding sites by jointly evaluating the complementarity of shape, charge distribution, and hydrophobic/hydrophilic properties.
- Developing CIRNet (Core Interacting Residues Network), which achieved an accuracy of approximately 0.82 on a balanced dataset and a ROC AUC of 0.72 in blind tests.
- Improving docking results by applying CIRNet's predictions to the top 10 models from three popular docking servers (ClusPro, pyDock, and LZerD), reducing the average RMSD to the native state by up to 58%.
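
To make the RMSD figure concrete, here is a minimal Python sketch of how an RMSD and a relative reduction in average RMSD could be computed. It is an illustration under simplifying assumptions, not the authors' evaluation pipeline: the function names `rmsd` and `relative_reduction` are hypothetical, the two structures are assumed to list the same atoms in the same order in a common reference frame (no superposition is performed), and all numbers are made-up toy values rather than results from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) tuples.

    Assumes matched atom order and a shared reference frame; no
    structural superposition is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


def relative_reduction(before, after):
    """Fractional decrease of the mean of `after` relative to the mean of `before`."""
    mean_before = sum(before) / len(before)
    mean_after = sum(after) / len(after)
    return (mean_before - mean_after) / mean_before


# Toy 'native' structure and a pose rigidly shifted by 3 Å along x.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
pose = [(x + 3.0, y, z) for (x, y, z) in native]
print(rmsd(native, pose))  # 3.0

# Made-up per-complex RMSD values (Å) before and after refinement.
rmsd_before = [10.0, 12.0, 8.0]
rmsd_after = [5.0, 6.0, 4.0]
print(relative_reduction(rmsd_before, rmsd_after))  # 0.5
```

Under this definition, a value of 0.58 would correspond to the "up to 58%" reduction quoted above. In docking benchmarks, RMSD is commonly computed after superimposing the receptor, a step this sketch deliberately leaves out.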