Dockformer: A transformer-based molecular docking paradigm for large-scale virtual screening

Zhangfan Yang, Junkai Ji, Shan He, Jianqiang Li, Ruibin Bai, Zexuan Zhu, Yew Soon Ong
2024-11-11
Abstract: Molecular docking enables virtual screening of compound libraries to identify potential ligands that target proteins of interest, a crucial step in drug development; however, as the size of the compound library increases, the computational complexity of traditional docking models increases. Deep learning algorithms can provide data-driven research and development models to increase the speed of the docking process. Unfortunately, few models can achieve superior screening performance compared to that of traditional models. Therefore, a novel deep learning-based docking approach named Dockformer is introduced in this study. Dockformer leverages multimodal information to capture the geometric topology and structural knowledge of molecules and can directly generate binding conformations with the corresponding confidence measures in an end-to-end manner. The experimental results show that Dockformer achieves success rates of 90.53% and 82.71% on the PDBbind core set and PoseBusters benchmarks, respectively, and more than a 100-fold increase in the inference process speed, outperforming almost all state-of-the-art docking methods. In addition, the ability of Dockformer to identify the main protease inhibitors of coronaviruses is demonstrated in a real-world virtual screening scenario. Considering its high docking accuracy and screening efficiency, Dockformer can be regarded as a powerful and robust tool in the field of drug design.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses the computational cost and time consumption of molecular docking in large-scale virtual screening (LSVS). Traditional docking methods become increasingly expensive as compound libraries grow. Deep learning algorithms can speed up docking, but few existing deep learning models achieve better screening performance than traditional methods.

Specifically, the paper proposes Dockformer, a Transformer-based docking method, to address the following problems:

1. **Improve Docking Accuracy**: Traditional optimization-based methods, although widely used in modern drug design, suffer from limited docking accuracy because their scoring functions are imprecise and their optimization algorithms cannot guarantee finding the global optimum.
2. **Accelerate the Screening Process**: Traditional docking methods require multiple independent optimization runs to sample possible binding conformations, which leads to extremely high computational costs in large-scale virtual screening tasks.
3. **Generate High-Confidence Binding Conformations**: Existing deep learning methods often ignore the topological information of molecules when generating binding conformations, which results in physically infeasible conformations.

By introducing Dockformer, the paper aims to significantly improve screening speed while maintaining high docking accuracy, thereby meeting the high-throughput requirements of large-scale virtual screening tasks.
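
The success rates quoted in the abstract are not defined in this summary. Docking benchmarks commonly count a prediction as successful when the root-mean-square deviation (RMSD) between the predicted and crystallographic ligand poses is within about 2 Å, so the following is a minimal sketch under that assumption. The function names, the 2.0 Å default, and the inclusive comparison are illustrative choices rather than details taken from the paper, and the RMSD here ignores molecular symmetry and any physical-validity checks a benchmark may also apply.

```python
import math


def rmsd(pred, ref):
    """RMSD between two equally ordered lists of (x, y, z) atom coordinates.

    Assumes atoms are already matched one-to-one; no symmetry correction
    or structural alignment is performed.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (p - r) ** 2
        for atom_pred, atom_ref in zip(pred, ref)
        for p, r in zip(atom_pred, atom_ref)
    )
    return math.sqrt(squared / len(pred))


def success_rate(rmsds, threshold=2.0):
    """Fraction of complexes whose RMSD is at or below `threshold` (angstroms).

    The 2.0 default is an assumed convention, not a value from the paper.
    """
    if not rmsds:
        raise ValueError("no RMSD values given")
    hits = sum(1 for value in rmsds if value <= threshold)
    return hits / len(rmsds)
```

For example, `success_rate([0.5, 1.9, 2.0, 3.1])` counts three of four hypothetical poses as successful. Reported figures such as 90.53% would come from applying this kind of count across a whole benchmark set.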