Protein Conformation Generation via Force-Guided SE(3) Diffusion Models

Yan Wang, Lihao Wang, Yuning Shen, Yiqun Wang, Huizhuo Yuan, Yue Wu, Quanquan Gu
2024-09-24
Abstract: The conformational landscape of proteins is crucial to understanding their functionality in complex biological processes. Traditional physics-based computational methods, such as molecular dynamics (MD) simulations, suffer from rare event sampling and long equilibration time problems, hindering their applications in general protein systems. Recently, deep generative modeling techniques, especially diffusion models, have been employed to generate novel protein conformations. However, existing score-based diffusion methods cannot properly incorporate important physical prior knowledge to guide the generation process, causing large deviations in the sampled protein conformations from the equilibrium distribution. In this paper, to overcome these limitations, we propose a force-guided SE(3) diffusion model, ConfDiff, for protein conformation generation. By incorporating a force-guided network with a mixture of data-based score models, ConfDiff can generate protein conformations with rich diversity while preserving high fidelity. Experiments on a variety of protein conformation prediction tasks, including 12 fast-folding proteins and the Bovine Pancreatic Trypsin Inhibitor (BPTI), demonstrate that our method surpasses the state-of-the-art method.
Biomolecules, Machine Learning
What problem does this paper attempt to address?
The paper addresses protein conformation generation and the shortcomings of existing approaches to it. Specifically:

1. **Problems with traditional methods**: Physics-based methods such as molecular dynamics (MD) simulations struggle with rare-event sampling and long equilibration times, which limits their application to general protein systems.
2. **Limitations of deep generative models**: Existing score-based diffusion methods can generate novel protein conformations but cannot properly incorporate physical prior knowledge to guide generation, so the sampled conformations deviate substantially from the equilibrium distribution.

To address these issues, the authors propose ConfDiff, a force-guided SE(3) diffusion model for protein conformation generation. By combining a mixture of data-based score models with a force-guided network, ConfDiff generates diverse, high-fidelity conformations that better follow the Boltzmann distribution. Experiments on several protein conformation prediction tasks, including 12 fast-folding proteins and the Bovine Pancreatic Trypsin Inhibitor (BPTI), show that the method surpasses the state-of-the-art method.
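The link between forces and the Boltzmann distribution can be illustrated with a small toy. For p(x) proportional to exp(-E(x)/kT), the score d/dx log p(x) equals F(x)/kT, where F(x) = -dE/dx is the force. The sketch below checks this identity numerically on a 1D double-well potential. Everything in it is an assumption made for illustration: the potential, the function names, the value of `KT`, and in particular the weighted sum in `guided_score`, which is a hypothetical way to mix a data-based score with a force term and is not the guidance scheme used by ConfDiff.

```python
KT = 1.0  # thermal energy in arbitrary units (assumed for this toy)


def energy(x: float) -> float:
    """Toy double-well potential E(x) = (x^2 - 1)^2 (not a protein energy)."""
    return (x * x - 1.0) ** 2


def force(x: float) -> float:
    """Force F(x) = -dE/dx for the toy potential."""
    return -4.0 * x * (x * x - 1.0)


def boltzmann_score(x: float, kt: float = KT) -> float:
    """Score of p(x) proportional to exp(-E(x)/kT): d/dx log p(x) = F(x)/kT."""
    return force(x) / kt


def numerical_score(x: float, kt: float = KT, h: float = 1e-5) -> float:
    """Central finite difference of the unnormalized log-density -E(x)/kT."""
    return (-energy(x + h) / kt + energy(x - h) / kt) / (2.0 * h)


def guided_score(data_score: float, x: float, weight: float, kt: float = KT) -> float:
    """Hypothetical guidance: data-based score plus a weighted force term."""
    return data_score + weight * boltzmann_score(x, kt)


if __name__ == "__main__":
    x = 0.5
    print(boltzmann_score(x), numerical_score(x))
    print(guided_score(data_score=0.0, x=x, weight=2.0))
```

The normalization constant of the Boltzmann distribution does not depend on x, so it drops out of the score; this is why a force field alone can supply score information without knowing the partition function.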