Equivariant Scalar Fields for Molecular Docking with Fast Fourier Transforms

Bowen Jing, Tommi Jaakkola, Bonnie Berger
2024-09-02
Abstract: Molecular docking is critical to structure-based virtual screening, yet the throughput of such workflows is limited by the expensive optimization of scoring functions involved in most docking algorithms. We explore how machine learning can accelerate this process by learning a scoring function with a functional form that allows for more rapid optimization. Specifically, we define the scoring function to be the cross-correlation of multi-channel ligand and protein scalar fields parameterized by equivariant graph neural networks, enabling rapid optimization over rigid-body degrees of freedom with fast Fourier transforms. The runtime of our approach can be amortized at several levels of abstraction, and is particularly favorable for virtual screening settings with a common binding pocket. We benchmark our scoring functions on two simplified docking-related tasks: decoy pose scoring and rigid conformer docking. Our method attains similar but faster performance on crystal structures compared to the widely-used Vina and Gnina scoring functions, and is more robust on computationally predicted structures. Code is available at https://github.com/bjing2016/scalar-fields.
Biomolecules, Machine Learning
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper addresses the computational cost of molecular docking. Docking is a key step in structure-based virtual screening, but in most docking algorithms the optimization of the scoring function is very time-consuming, which limits the throughput of the workflow. To accelerate this step, the authors use machine learning to learn a scoring function whose functional form allows faster optimization.

### Main Contributions

1. **A new scoring function**: Defined as the cross-correlation of ligand and protein scalar fields, which accelerates the optimization of ligand poses in molecular docking.
2. **Neural network parameterization and training methods**: For learning equivariant scalar fields of molecules.
3. **An evaluation of performance and runtime**: Showing that the scoring function outperforms or is comparable to existing methods on docking-related tasks while running faster.

### Method Overview

- **Equivariant Scalar Fields (ESFs)**: The scalar fields of proteins and ligands are parameterized by equivariant graph neural networks (E3NNs), so that each field transforms together with its molecule under rigid-body transformations.
- **Fast Fourier Transform (FFT)**: FFTs evaluate the scores of a large number of ligand poses simultaneously, over the translational space \( \mathbb{R}^3 \) and the rotational space \( \text{SO}(3) \), which significantly accelerates the optimization.
- **Training and Inference**: Each ligand pose is decomposed into a zero-mean conformer, a rotation, and a translation. The conditional log-likelihood of the pose is decomposed accordingly, and the resulting terms are optimized separately to train the model. During inference, candidate ligand poses are rapidly evaluated and optimized with the FFT method.

### Experimental Results

- **Benchmarking**: The method is evaluated on two simplified docking-related tasks: decoy pose scoring and rigid conformer docking. It performs comparably to the widely used Vina and Gnina scoring functions on crystal structures and performs better on computationally predicted structures.
- **Virtual Screening Setup**: On the PDE10A test set, which contains only one unique protein structure, the method achieved a 50-fold reduction in total inference time without losing accuracy.

### Conclusion

The paper proposes a method based on equivariant scalar fields and fast Fourier transforms that significantly improves the computational efficiency of molecular docking while maintaining the quality of the scoring function, making it suitable for large-scale structure-based virtual screening.
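
The translational part of the FFT step in the method overview can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the multi-channel fields have already been sampled on a small periodic cubic grid, it uses random arrays in place of learned fields, and it ignores rotations entirely. The function names `translational_scores` and `brute_force_score` are hypothetical. Under these assumptions, the score of a translation \( t \) is \( \sum_c \sum_x \phi^P_c(x)\, \phi^L_c(x - t) \), and the cross-correlation theorem yields this score for every grid translation from a handful of FFTs.

```python
import numpy as np


def translational_scores(protein_field, ligand_field):
    """Score every circular grid translation of the ligand at once.

    Both inputs are real arrays of shape (C, N, N, N): C channels on a
    periodic cubic grid (an assumption made for this sketch).
    Returns an (N, N, N) array with
        scores[t] = sum_c sum_x protein[c, x] * ligand[c, x - t].
    """
    axes = (1, 2, 3)
    protein_hat = np.fft.fftn(protein_field, axes=axes)
    ligand_hat = np.fft.fftn(ligand_field, axes=axes)
    # Cross-correlation theorem: corr = IFFT(FFT(p) * conj(FFT(l))).
    corr = np.fft.ifftn(protein_hat * np.conj(ligand_hat), axes=axes).real
    # The total score sums the per-channel cross-correlations.
    return corr.sum(axis=0)


def brute_force_score(protein_field, ligand_field, shift):
    """Reference score for a single translation, computed directly."""
    # np.roll(l, t)[x] == l[x - t] with periodic wrap-around.
    shifted = np.roll(ligand_field, shift, axis=(1, 2, 3))
    return float((protein_field * shifted).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ligand = rng.normal(size=(4, 16, 16, 16))
    # Toy "protein" field: the ligand field moved to a known offset.
    protein = np.roll(ligand, (2, 5, 7), axis=(1, 2, 3))

    scores = translational_scores(protein, ligand)
    best = np.unravel_index(np.argmax(scores), scores.shape)
    print("best translation:", tuple(int(i) for i in best))
```

The direct evaluation costs one pass over the grid per candidate translation, whereas the FFT route scores all \( N^3 \) translations together. In a screening setting with a single shared pocket, the protein-side transform would also only need to be computed once, which is the kind of amortization the summary refers to.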