Assessing interaction recovery of predicted protein-ligand poses

David Errington, Constantin Schneider, Cédric Bouysset, Frédéric A. Dreyer
2024-09-30
Abstract: The field of protein-ligand pose prediction has seen significant advances in recent years, with machine learning-based methods now being commonly used in lieu of classical docking methods or even to predict all-atom protein-ligand complex structures. Most contemporary studies focus on the accuracy and physical plausibility of ligand placement to determine pose quality, often neglecting a direct assessment of the interactions observed with the protein. In this work, we demonstrate that ignoring protein-ligand interaction fingerprints can lead to overestimation of model performance, most notably in recent protein-ligand cofolding models which often fail to recapitulate key interactions.
Biomolecules, Machine Learning
What problem does this paper attempt to address?
The paper asks how well different pose prediction methods recover key protein-ligand interactions when predicting protein-ligand complex structures. Specifically:

1. **Problem Background**: In recent years, machine learning-based methods have made significant progress in predicting protein-ligand binding poses. These methods are often used in place of classical docking methods and can even predict all-atom protein-ligand complex structures directly. However, existing studies mostly assess the accuracy and physical plausibility of ligand placement, and neglect a direct evaluation of the interactions the ligand makes with the protein.
2. **Research Objective**: The paper demonstrates that ignoring protein-ligand interaction fingerprints (PLIFs) can lead to an overestimation of model performance, most notably for recent protein-ligand cofolding models, which often fail to reproduce key interactions. The authors therefore propose PLIFs as a metric for evaluating model quality and use them to benchmark a range of modern pose prediction tools.
3. **Method Comparison**: The paper compares classical docking algorithms (e.g., GOLD), machine learning docking algorithms (e.g., DiffDock-L), and protein-ligand cofolding models (e.g., RoseTTAFold-AllAtom). It finds that classical docking algorithms generally recover key interactions better than machine learning methods, particularly important interactions such as hydrogen bonds.
4. **Conclusion**: By introducing the PLIF recovery rate as a new metric, the paper argues that drug discovery applications should track the recovery of protein-ligand interactions in addition to RMSD and PoseBusters validity. This points to a direction for future improvements in machine learning models, such as incorporating explicit PLIF or pharmacophore-sensitive loss functions during training.
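
To make the idea of a PLIF recovery rate concrete, here is a minimal sketch. It is not the authors' implementation: it assumes each pose has already been reduced to a set of (residue, interaction type) pairs by some external fingerprinting tool, and it defines recovery as the fraction of reference interactions that also appear in the predicted pose. The function name `plif_recovery`, the residue labels, and the interaction type names are all hypothetical and chosen only for illustration.

```python
from typing import Iterable, Tuple

# An interaction is modelled as (residue identifier, interaction type).
# This representation is an assumption made for illustration.
Interaction = Tuple[str, str]


def plif_recovery(reference: Iterable[Interaction],
                  predicted: Iterable[Interaction]) -> float:
    """Fraction of reference interactions also present in the predicted pose."""
    ref = set(reference)
    if not ref:
        # With no reference interactions the ratio is undefined.
        raise ValueError("reference pose has no interactions to recover")
    return len(ref & set(predicted)) / len(ref)


# Hypothetical fingerprints for a crystal pose and a predicted pose.
reference = {
    ("ASP189", "HBDonor"),
    ("GLY216", "HBAcceptor"),
    ("TYR228", "PiStacking"),
    ("SER190", "HBDonor"),
}
predicted = {
    ("ASP189", "HBDonor"),
    ("TYR228", "PiStacking"),
    ("TRP215", "Hydrophobic"),  # extra contact, not in the reference
}

print(plif_recovery(reference, predicted))  # 0.5: two of four reference interactions recovered
```

Under this definition, interactions that appear only in the predicted pose do not affect the score. A pose can therefore sit close to the reference by RMSD and still score poorly if it misses, for example, the reference hydrogen bonds.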