Abstract: The launch of the AlphaFold series has brought deep-learning techniques into molecular structural science. Structure-based prediction of protein-ligand binding affinity, another crucial problem, urgently calls for advanced computational techniques. Is deep learning ready to decode this problem? Here we review mainstream structure-based deep-learning approaches to it, focusing on molecular representations, learning architectures and model interpretability, and we provide a model taxonomy. To compensate for the lack of valid comparisons among these models, we implemented and evaluated representative models on a uniform basis and discuss their advantages and shortcomings. This review may benefit structure-based drug discovery and related areas.

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

This paper examines the current state and challenges of applying deep learning to structure-based protein-ligand binding affinity prediction (PLBAP). Specifically, it focuses on the following aspects:

1. **Molecular Representation**: How to represent the structural information of proteins and ligands so that deep learning models can process it effectively.
2. **Learning Architectures**: The learning architectures adopted by current mainstream deep learning models for PLBAP, and their pros and cons.
3. **Model Interpretability**: How to improve the interpretability of deep learning models to better understand their prediction mechanisms.

### Background

The interaction between proteins and ligands is one of the key issues in drug discovery research, and predicting binding affinity is crucial for identifying potential drug candidates. Although traditional molecular docking methods can quickly generate binding poses close to experimental structures, they perform poorly in downstream tasks such as distinguishing binders from non-binders and ranking ligands. More effective binding affinity prediction methods are therefore needed.

### Research Focus

1. **Convolutional Neural Network Based on Atomic Coordinates and Types (TACNN)**:
   - Uses atomic coordinates and types as inputs and predicts binding affinity through atomic type convolution and radial pooling operations.
   - The model has hierarchical interpretability, from atomic pair interactions to molecular-level energy accumulation, and then to the overall thermodynamic cycle.
2. **Convolutional Neural Network Based on Intermolecular Contacts (TIMC-CNN)**:
   - Represents protein-ligand interactions as intermolecular contacts and learns these features through a 2D-CNN.
   - Partial interpretability can be achieved by measuring the importance of features in affinity prediction.
3. **Convolutional Neural Network Based on Molecular Grids (TGrid-CNN)**:
   - Uses molecular grids to represent protein-ligand complexes and learns these grids through a 3D-CNN.
   - Provides some visualization strategies to assess prediction-level interpretability, such as generating heatmaps through masking operations.
4. **Graph Convolutional Network Based on Molecular Graphs (TGraph-GCN)**:
   - Represents protein-ligand complexes as graphs and learns the features of nodes and edges through graph convolutional networks.
   - The model can be interpreted at both the model level and the prediction level by measuring the importance of features to understand the prediction mechanism.

### Evaluation

To comprehensively evaluate these four types of models (TACNN, TIMC-CNN, TGrid-CNN, and TGraph-GCN), the authors constructed representative models using unified training data and attribute generation rules. The evaluation data includes the PDBbind Refined Set for model training, the Core Set for hyperparameter tuning, and two test sets from the CSAR-HiQ dataset.

### Conclusion

By reviewing and evaluating mainstream structure-based deep learning PLBAP models, this paper provides a useful reference for research in structure-based drug discovery and related fields. The paper not only discusses the advantages and disadvantages of the various models but also explores how to improve model interpretability and screening performance.
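
To make the intermolecular-contact and molecular-grid representations described under Research Focus more concrete, the following is a minimal sketch of how such inputs might be computed from atomic coordinates. It is not the authors' implementation: the function names `contact_counts` and `occupancy_grid`, the element-type set, the 4.0 Å cutoff, and the box size and resolution are all illustrative assumptions that do not come from the paper.

```python
import numpy as np

# Hypothetical element-type vocabulary; the paper's attribute rules may differ.
TYPE_NAMES = ("C", "N", "O", "S")


def contact_counts(prot_xyz, prot_types, lig_xyz, lig_types,
                   type_names=TYPE_NAMES, cutoff=4.0):
    """Count protein-ligand atom pairs closer than `cutoff` (angstroms).

    Returns a (n_types, n_types) integer matrix whose [i, j] entry is the
    number of (protein atom of type i, ligand atom of type j) pairs within
    the cutoff. Atoms with a type outside `type_names` are ignored.
    """
    prot_xyz = np.asarray(prot_xyz, dtype=float)
    lig_xyz = np.asarray(lig_xyz, dtype=float)
    index = {name: k for k, name in enumerate(type_names)}
    # Pairwise distances, shape (n_protein_atoms, n_ligand_atoms).
    dist = np.linalg.norm(prot_xyz[:, None, :] - lig_xyz[None, :, :], axis=-1)
    counts = np.zeros((len(type_names), len(type_names)), dtype=int)
    for i, j in zip(*np.nonzero(dist < cutoff)):
        pi = index.get(prot_types[i])
        lj = index.get(lig_types[j])
        if pi is not None and lj is not None:
            counts[pi, lj] += 1
    return counts


def occupancy_grid(xyz, types, center, type_names=TYPE_NAMES,
                   box=20.0, resolution=1.0):
    """Voxelize atoms into a (n_types, n, n, n) binary occupancy grid.

    The cubic box of edge `box` is centered at `center`; atoms falling
    outside the box or having an unknown type are skipped.
    """
    n = int(round(box / resolution))
    grid = np.zeros((len(type_names), n, n, n), dtype=np.float32)
    index = {name: k for k, name in enumerate(type_names)}
    # Shift so the lower corner of the box is the origin, then bin into voxels.
    shifted = np.asarray(xyz, dtype=float) - np.asarray(center, dtype=float) + box / 2.0
    voxels = np.floor(shifted / resolution).astype(int)
    for (ix, iy, iz), t in zip(voxels, types):
        channel = index.get(t)
        inside = 0 <= ix < n and 0 <= iy < n and 0 <= iz < n
        if channel is not None and inside:
            grid[channel, ix, iy, iz] = 1.0
    return grid
```

Under these assumptions, the contact matrix is the kind of 2D array a 2D-CNN could consume, while the per-type occupancy grid has the channel-plus-three-spatial-axes layout a 3D-CNN expects. Published models typically use richer atom typing and smoothed densities rather than binary occupancy.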

Structure-based, deep-learning models for protein-ligand binding affinity prediction

Prediction of protein–ligand binding affinity via deep learning models

Scoring Functions for Protein-Ligand Binding Affinity Prediction Using Structure-based Deep Learning: A Review

Deep learning in modelling the protein–ligand interaction: new pathways in drug development

Development and evaluation of a deep learning model for protein-ligand binding affinity prediction

AI-Driven Deep Learning Techniques in Protein Structure Prediction

Improved Protein–Ligand Binding Affinity Prediction with Structure-Based Deep Fusion Inference

DEELIG: A Deep Learning Approach to Predict Protein-Ligand Binding Affinity

Machine Learning for Sequence and Structure-Based Protein–Ligand Interaction Prediction

DeepRLI: A Multi-objective Framework for Universal Protein--Ligand Interaction Prediction

Protein-ligand binding affinity prediction: Is 3D binding pose needed?

Deep Learning in Protein Structural Modeling and Design

Binding Affinity Prediction with 3D Machine Learning: Training Data and Challenging External Testing

Benchmarking AlphaFold for protein complex modeling reveals accuracy determinants

DeepDTAF: a deep learning method to predict protein–ligand binding affinity

On Machine Learning Approaches for Protein-Ligand Binding Affinity Prediction

Exploring protein–ligand binding affinity prediction with electron density-based geometric deep learning

Enhancing Drug-Target Binding Affinity Prediction through Deep Learning and Protein Secondary Structure Integration

Protein-RNA interaction prediction with deep learning: Structure matters

Do Deep Learning Models for Co-Folding Learn the Physics of Protein-Ligand Interactions?

State-specific protein-ligand complex structure prediction with a multi-scale deep generative model