MolSnapper: Conditioning Diffusion for Structure Based Drug Design

Yael Ziv, Brian Marsden, Charlotte Deane
DOI: https://doi.org/10.1101/2024.03.28.586278
2024-09-14
Abstract: Generative models have emerged as potentially powerful methods for molecular design, yet challenges persist in generating molecules that effectively bind to the intended target. The ability to control the design process and incorporate prior knowledge would be highly beneficial for better tailoring molecules to fit specific binding sites. In this paper, we introduce MolSnapper, a novel tool that is able to condition diffusion models for structure-based drug design by seamlessly integrating expert knowledge in the form of 3D pharmacophores. We demonstrate through comprehensive testing on both CrossDocked and Binding MOAD datasets that our method generates molecules better tailored to fit a given binding site, achieving high structural and chemical similarity to the original molecules. It also, when compared to alternative methods, yields approximately twice as many valid molecules.
Bioinformatics
What problem does this paper attempt to address?
### Problems Addressed by the Paper

This paper addresses three challenges in structure-based drug design (SBDD):

1. **Generating molecules that effectively bind to target proteins**: Existing generative models show promise for molecular design but still struggle to generate molecules that bind effectively to the intended target.
2. **Controlling the design process and integrating prior knowledge**: Incorporating expert knowledge, such as 3D pharmacophores, during generation would help tailor molecules to a specific binding site.
3. **Improving the quality and feasibility of generated molecules**: Generated molecules should be structurally and chemically similar to the original (reference) molecules, and should also be synthetically feasible and physically plausible.

### Solution

To address these challenges, the authors propose MolSnapper, a tool that conditions diffusion models on expert knowledge, in the form of 3D pharmacophores, for structure-based drug design. Specifically:

- **Conditioned diffusion model**: MolSnapper introduces 3D pharmacophore positions and types as constraints during generation, adjusting the diffusion model's parameterized distribution so that generated molecules better satisfy those constraints.
- **Preventing clashes with the protein**: A guidance loss that penalizes steric conflicts keeps generated molecules from overlapping with, or sitting too close to, the protein.
- **Flexible constraint types**: Users can select constraints themselves or extract them automatically, depending on their needs; for example, constraints can be extracted from fragment screening experiments.
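
The clash-avoidance idea above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the 2.0 Å distance threshold, the squared-violation form of the penalty, and the step size are all assumptions chosen for illustration. The sketch penalizes ligand atoms that come closer to any protein atom than a minimum distance and uses the gradient of that penalty to push them away.

```python
import numpy as np


def clash_penalty(ligand_xyz, protein_xyz, min_dist=2.0):
    """Sum of squared distance violations (hypothetical penalty form).

    ligand_xyz: (n_ligand, 3) coordinates in angstroms.
    protein_xyz: (n_protein, 3) coordinates in angstroms.
    min_dist: assumed minimum allowed ligand-protein distance.
    """
    # Pairwise distances, shape (n_ligand, n_protein).
    diff = ligand_xyz[:, None, :] - protein_xyz[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Only pairs closer than min_dist contribute.
    violation = np.clip(min_dist - dist, 0.0, None)
    return float(np.sum(violation ** 2))


def clash_penalty_grad(ligand_xyz, protein_xyz, min_dist=2.0, eps=1e-8):
    """Gradient of clash_penalty with respect to the ligand coordinates."""
    diff = ligand_xyz[:, None, :] - protein_xyz[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    violation = np.clip(min_dist - dist, 0.0, None)
    # d/dx (min_dist - d)^2 = -2 * (min_dist - d) * (x - p) / d
    per_pair = (-2.0 * violation / (dist + eps))[..., None] * diff
    return per_pair.sum(axis=1)


# Toy example: one clashing ligand atom and one distant ligand atom.
protein = np.array([[0.0, 0.0, 0.0]])
ligand = np.array([[1.0, 0.0, 0.0], [5.0, 0.0, 0.0]])

before = clash_penalty(ligand, protein)
# One guidance step: move against the gradient (step size is an assumption).
ligand_moved = ligand - 0.1 * clash_penalty_grad(ligand, protein)
after = clash_penalty(ligand_moved, protein)
```

In a guided diffusion sampler, a gradient step of this kind would typically be applied to the predicted coordinates at each denoising step; the clashing atom moves away from the protein while the distant atom is left untouched.
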
### Experimental Results

In comprehensive tests on the CrossDocked and Binding MOAD datasets, MolSnapper showed the following advantages:

- **Generated molecules are closer to reference molecules**: On the SC RDKit score, molecules generated by MolSnapper significantly outperformed those generated by other methods.
- **Higher synthetic feasibility and physical plausibility**: Molecules generated by MolSnapper have higher synthetic accessibility (SA score) and a higher PoseBusters pass rate.
- **Better hydrogen bond recovery**: Molecules generated by MolSnapper better reproduce the hydrogen bond interactions between the reference ligand and the protein.

In summary, MolSnapper significantly improves the quality and feasibility of generated molecules through conditioned diffusion models and flexible constraint strategies, providing a new solution for structure-based drug design.
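
One plausible way to quantify how well hydrogen bond interactions are reproduced is a simple recovery fraction. The sketch below is an assumption for illustration, not the paper's evaluation code: the function name, the `(residue, interaction type)` tuple representation, and the residue labels are all hypothetical, and detecting the interactions themselves is left to an external tool.

```python
def interaction_recovery(reference, generated):
    """Fraction of reference interactions that the generated molecule also makes.

    Both arguments are iterables of hashable interaction labels, here
    hypothetical (residue, interaction_type) tuples.
    Returns None when the reference has no interactions to recover.
    """
    reference = set(reference)
    if not reference:
        return None
    return len(reference & set(generated)) / len(reference)


# Hypothetical interaction sets for a reference ligand and a generated molecule.
reference_hbonds = {("ASP86", "hbond_acceptor"), ("LYS33", "hbond_donor")}
generated_hbonds = {("ASP86", "hbond_acceptor"), ("GLU12", "hbond_donor")}

recovery = interaction_recovery(reference_hbonds, generated_hbonds)
```

Interactions that appear only in the generated molecule do not change the score under this definition; it measures recovery of the reference interactions only.
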