Pose Ensemble Graph Neural Networks to Improve Docking Performances.

Thanawat Thaingtamtanha, Jordane Preto, Francesco Gentile
DOI: https://doi.org/10.26434/chemrxiv-2024-sw04g
2024-11-21
Abstract: The prediction of the geometry and strength governing small molecule-protein interactions remains a paramount challenge in drug discovery due to their complex and dynamic nature. A number of machine learning (ML) methods have been proposed to complement and improve on physics-based tools such as molecular docking, usually by mapping three dimensional features of individual poses to their closeness to experimental structures and/or to binding affinities. Here, we introduce Dockbox2 (DBX2), a novel approach that encodes ensembles of computational poses within a graph neural network architecture via simple energy-based features derived from molecular docking. The model was jointly trained to predict binding pose likelihood as a node-level task and binding affinity as a graph-level task using the PDBbind dataset and demonstrated significant performance in comprehensive, retrospective docking and virtual screening experiments. Our results encourage further exploration of ML models based on conformational ensembles to provide more accurate estimates of small molecule-protein interactions and thermodynamics. The DBX2 code is available at https://github.com/jp43/DockBox2.
Chemistry
What problem does this paper attempt to address?
The paper asks how to predict the binding geometry and binding strength between small molecules and proteins more accurately during drug discovery. It focuses on four challenges:

1. **Prediction of binding geometry**: In drug design, accurately predicting the binding mode (i.e., binding geometry) between small molecules and proteins is crucial. Traditional molecular docking methods can generate multiple possible binding conformations, but the accuracy of their predictions is often limited.
2. **Prediction of binding affinity**: Binding affinity reflects the strength of the interaction between small molecules and proteins and is usually represented by the dissociation constant \(K_d\). Experimental methods for determining binding affinity are accurate but time-consuming and costly, so more efficient and accurate computational methods are needed.
3. **Consideration of dynamic properties**: The binding between small molecules and proteins is a dynamic process involving multiple conformational changes. Traditional molecular docking methods usually assume that the protein binding pocket is rigid and ignore protein flexibility, which may lead to inaccurate predictions.
4. **Application of machine-learning methods**: In recent years, machine-learning methods have been widely used to improve the performance of molecular docking. However, most existing machine-learning models are trained on a single conformation and may not fully capture the thermodynamic characteristics and kinetic behavior of the interaction between small molecules and proteins.
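As a brief illustration of the dissociation constant mentioned above, the sketch below converts \(K_d\) into two commonly used affinity scales: \(pK_d = -\log_{10} K_d\) and the standard binding free energy \(\Delta G^\circ = RT \ln K_d\) (with \(K_d\) in molar units relative to a 1 M standard state). The function names and the default temperature are illustrative assumptions, and the source does not state which scale DBX2 uses as its training target.

```python
import math

# Gas constant in kcal/(mol*K); the unit choice is an assumption for illustration.
R_KCAL = 1.987204e-3


def pkd_from_kd(kd_molar: float) -> float:
    """Return pKd = -log10(Kd) for a dissociation constant given in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return -math.log10(kd_molar)


def delta_g_from_kd(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Return the standard binding free energy in kcal/mol, dG = RT ln(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


# Hypothetical example: a 1 nM binder.
example_pkd = pkd_from_kd(1e-9)
example_dg = delta_g_from_kd(1e-9)
```

A lower \(K_d\) gives a higher \(pK_d\) and a more negative \(\Delta G^\circ\), which is why either quantity can serve as a regression target for affinity prediction.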
To solve the above problems, the paper proposes **DockBox2 (DBX2)**, a new method based on graph neural networks (GNNs) that encodes multiple computed binding conformations and simultaneously predicts the correctness of each conformation and the binding affinity. DBX2 improves on existing methods in the following ways:

- **Multi-conformation encoding**: DBX2 encodes multiple binding conformations into a graph structure, where each node represents a binding conformation and the edges represent the similarity between conformations (such as RMSD). This representation is intended to better capture the dynamic characteristics of the interaction between small molecules and proteins.
- **Joint training**: DBX2 is trained simultaneously on a node-level task (predicting the correctness of binding conformations) and a graph-level task (predicting binding affinity), which the paper reports improves the overall performance of the model.
- **Retrospective validation**: The paper evaluates DBX2 in a series of retrospective experiments and reports significant improvements in both molecular docking and virtual screening tasks.

In summary, this paper aims to improve the prediction accuracy of the binding geometry and binding affinity between small molecules and proteins by introducing the DBX2 model, thereby accelerating the drug discovery process.
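The pose-ensemble graph and the joint objective described above can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the DBX2 implementation: the RMSD cutoff of 2.0 Å, the choice of binary cross-entropy for the node-level task, squared error for the graph-level task, and the `affinity_weight` parameter are all hypothetical, and the function names are invented for this example.

```python
import numpy as np


def build_pose_graph(rmsd: np.ndarray, cutoff: float = 2.0) -> np.ndarray:
    """Connect pose i to pose j when their pairwise RMSD is below `cutoff`.

    `rmsd` is a symmetric (n_poses, n_poses) matrix. Returns an edge index
    array of shape (2, n_edges) without self-loops. The cutoff value is an
    assumption made for illustration.
    """
    rmsd = np.asarray(rmsd, dtype=float)
    n = rmsd.shape[0]
    if rmsd.shape != (n, n):
        raise ValueError("rmsd must be a square matrix")
    close = (rmsd < cutoff) & ~np.eye(n, dtype=bool)
    src, dst = np.where(close)
    return np.stack([src, dst], axis=0)


def joint_loss(node_logits, node_labels, pred_affinity, true_affinity,
               affinity_weight: float = 1.0) -> float:
    """Combine a node-level and a graph-level loss for one pose ensemble.

    Node level: binary cross-entropy on "is this pose correct?" labels.
    Graph level: squared error on a single affinity value.
    Both loss choices and the weighting are assumptions.
    """
    logits = np.asarray(node_logits, dtype=float)
    labels = np.asarray(node_labels, dtype=float)
    probs = 1.0 / (1.0 + np.exp(-logits))
    eps = 1e-12  # avoids log(0)
    bce = -np.mean(labels * np.log(probs + eps)
                   + (1.0 - labels) * np.log(1.0 - probs + eps))
    mse = (float(pred_affinity) - float(true_affinity)) ** 2
    return float(bce + affinity_weight * mse)


# Hypothetical ensemble of three poses: poses 0 and 1 are similar, pose 2 is distant.
rmsd_matrix = np.array([[0.0, 1.0, 5.0],
                        [1.0, 0.0, 4.0],
                        [5.0, 4.0, 0.0]])
edges = build_pose_graph(rmsd_matrix)
loss = joint_loss(node_logits=[0.0, 0.0, 0.0], node_labels=[1, 0, 0],
                  pred_affinity=6.0, true_affinity=5.0)
```

In a real model the node logits and the affinity prediction would come from message passing over `edges` with per-pose docking features attached to the nodes; here they are fixed values so that the loss arithmetic is easy to follow.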