Abstract: Molecular docking, the task of predicting the binding structure of a protein and a small-molecule ligand, plays a significant role in structure-based drug discovery. In recent years, numerous deep learning-based methods for molecular docking have emerged. State-of-the-art approaches such as DiffDock formulate the docking problem using diffusion generative models and outperform traditional docking algorithms. However, although these deep learning-based methods predict binding poses well, they often lack a well-defined scoring function, which makes it difficult to distinguish strong from weak inhibitors during virtual screening. To address this limitation, we introduce FeatureDock, a Transformer-based deep learning framework that accurately predicts protein-ligand binding poses and achieves strong scoring power for virtual screening. FeatureDock extracts chemical features from local environments within protein structures and uses a Transformer encoder to predict probability density envelopes indicating where ligands are most likely to bind in the protein pocket. We also designed a scoring function, which encodes the predicted probability density envelope, to optimize and score ligand poses. In addition, the attention mechanism in FeatureDock's Transformer enhances the model's interpretability by providing the attention weight of each chemical feature from the protein structure in predicting the binding probabilities. When applied to virtual screening, FeatureDock outperforms DiffDock, Smina, and AutoDock Vina in distinguishing strong inhibitors from weak ones for both the inactivated form of Cyclin-Dependent Kinase 2 (CDK2) and Angiotensin-Converting Enzyme (ACE). Performance was assessed using the Kullback–Leibler (KL) divergence and the area under the receiver operating characteristic curve (AUC). We also demonstrate that FeatureDock can accurately predict binding poses, achieving an average RMSD of 2.4 Å when compared to CDK2-ligand co-crystal structures. We anticipate that FeatureDock holds promise for wide application in virtual screening to assist drug design. FeatureDock is available at https://github.com/xuhuihuang/featuredock.
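
For readers unfamiliar with the three evaluation quantities named in the abstract (AUC, KL divergence, and RMSD), the following is a minimal NumPy sketch of how such metrics are commonly computed. It is not taken from the FeatureDock code base. The function names, the histogram-based KL estimate, the bin count, and the convention that a higher score means a predicted stronger binder are all assumptions made for illustration. The RMSD here assumes the two poses list the same atoms in the same order and share the receptor frame, with no superposition or symmetry correction.

```python
import numpy as np


def roc_auc(scores_strong, scores_weak):
    """AUC as the probability that a random strong inhibitor outscores
    a random weak one (ties count as 0.5).
    Assumption: higher score = predicted stronger binder."""
    s = np.asarray(scores_strong, dtype=float)[:, None]
    w = np.asarray(scores_weak, dtype=float)[None, :]
    return float(np.mean((s > w) + 0.5 * (s == w)))


def kl_divergence(scores_p, scores_q, bins=20, eps=1e-9):
    """KL(P || Q) between histogram estimates of two score distributions.
    Both histograms share the same bin edges; eps avoids log(0).
    The bin count and eps are illustrative choices."""
    p_scores = np.asarray(scores_p, dtype=float)
    q_scores = np.asarray(scores_q, dtype=float)
    lo = min(p_scores.min(), q_scores.min())
    hi = max(p_scores.max(), q_scores.max())
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(p_scores, bins=edges)
    q, _ = np.histogram(q_scores, bins=edges)
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two (N, 3) coordinate arrays,
    assuming matched atom order and no alignment step."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("coordinate arrays must have the same shape")
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))
```

In a screening setting, a larger KL divergence between the score distributions of strong and weak inhibitors and an AUC closer to 1 both indicate better separation of the two groups, while a lower RMSD against the co-crystal ligand indicates a more accurate predicted pose.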

DrugCLIP: Contrastive Protein-Molecule Representation Learning for Virtual Screening

A Case-Based Meta-Learning Algorithm Boosts the Performance of Structure-Based Virtual Screening

Multimodal Protein-Ligand Contrastive Pretraining for Effective and Efficient Drug Discovery

Deep contrastive learning enables genome-wide virtual screening

Enhancing Challenging Target Screening via Multimodal Protein-Ligand Contrastive Learning

Hashing based Contrastive Learning for Virtual Screening

S-MolSearch: 3D Semi-supervised Contrastive Learning for Bioactive Molecule Search

GraphCL-DTA: a graph contrastive learning with molecular semantics for drug-target binding affinity prediction

Contrastive learning in protein language space predicts interactions between drugs and protein targets

BigBind: Learning from Nonstructural Data for Structure-Based Virtual Screening

Efficient Exploration of Chemical Space with Docking and Deep Learning

Supervised graph co-contrastive learning for drug–target interaction prediction

An artificial intelligence accelerated virtual screening platform for drug discovery

ClusterX: a novel representation learning-based deep clustering framework for accurate visual inspection in virtual screening

FeatureDock: Protein-Ligand Docking Guided by Physicochemical Feature-Based Local Environment Learning using Transformer

Computational representations of protein–ligand interfaces for structure-based virtual screening

DrugCLIP: Contrastive Drug-Disease Interaction For Drug Repurposing

Advancing Ligand Docking through Deep Learning: Challenges and Prospects in Virtual Screening

Docking-based Virtual Screening with Multi-Task Learning

Artificial intelligence-enabled virtual screening of ultra-large chemical libraries with deep docking

QuickBind: A Light-Weight And Interpretable Molecular Docking Model