Abstract: Most biological processes are governed by specific protein-ligand interactions. Discerning the components that contribute to a favorable protein-ligand interaction could significantly improve our understanding of protein function, help rationalize drug design and yield design principles for protein engineering. The Protein Data Bank (PDB) currently hosts the structures of ∼68 000 protein-ligand complexes. Although several databases classify proteins according to sequence and structure, only a handful annotate and classify protein-ligand interactions and provide information on the different attributes of molecular recognition. In this study, an exhaustive comparison of all biologically relevant ligand-binding sites (84 846 sites) was conducted using PocketMatch, a rapid, parallel, in-house algorithm. PocketMatch quantifies the similarity between binding sites based on structural descriptors and residue attributes. A similarity network was constructed from pairs of binding sites whose PocketMatch scores exceeded a high similarity threshold (0.80). This binding site similarity network was then clustered into discrete sets of similar sites using the Markov clustering (MCL) algorithm. Furthermore, various computational tools were used to study different attributes of the interactions within individual clusters. The attributes fall roughly into three groups: (i) binding site characteristics, including pocket shape, nature of residues and interaction profiles with different kinds of atomic probes; (ii) atomic contacts, consisting of various types of polar, hydrophobic and aromatic contacts, along with binding site water molecules that could play crucial roles in protein-ligand interactions; and (iii) binding energetics of the interactions, derived from scoring functions developed for docking.
For each ligand-binding site of each protein in the PDB, site similarity information, the cluster it belongs to and a description of its site attributes are provided in a relational database, Protein-Ligand Interaction Clusters (PLIC). Database URL: http://proline.biochem.iisc.ernet.in/PLIC.
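
The thresholding and clustering steps described above can be illustrated with a small sketch. This is not the PLIC pipeline or the reference MCL implementation: the site names, the pairwise scores, the function names and the inflation value of 2.0 are all assumptions made for illustration. Only the 0.80 cutoff comes from the abstract. The code builds a similarity network from pairs scoring above the cutoff and then runs a toy, dense-matrix Markov clustering pass (expansion followed by inflation) in pure Python.

```python
# Toy sketch: threshold pairwise binding-site scores, then cluster with a
# minimal Markov clustering (MCL) loop. All data below are hypothetical.

THRESHOLD = 0.80  # high-similarity cutoff quoted in the abstract


def build_adjacency(sites, scores, threshold=THRESHOLD):
    """Dense symmetric adjacency matrix with self-loops.

    Only pairs whose score exceeds the threshold become edges.
    """
    index = {name: i for i, name in enumerate(sites)}
    n = len(sites)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for (a, b), score in scores.items():
        if score > threshold:
            i, j = index[a], index[b]
            adj[i][j] = adj[j][i] = score
    return adj


def normalize_columns(m):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adj, inflation=2.0, max_iter=50, tol=1e-9):
    """Alternate expansion (M @ M) and inflation (entrywise power) until stable."""
    n = len(adj)
    m = normalize_columns(adj)
    for _ in range(max_iter):
        expanded = matmul(m, m)
        inflated = normalize_columns(
            [[x ** inflation for x in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries mass is an attractor; the columns it
    # covers form one cluster. Identical rows collapse via the set.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return clusters


def cluster_sites(sites, scores):
    """Return clusters as sorted lists of site names."""
    raw = mcl(build_adjacency(sites, scores))
    return sorted(sorted(sites[j] for j in members) for members in raw)


# Hypothetical sites and scores; the C-D pair falls below the cutoff.
sites = ["A", "B", "C", "D", "E"]
scores = {
    ("A", "B"): 0.92, ("B", "C"): 0.85, ("A", "C"): 0.88,
    ("D", "E"): 0.95, ("C", "D"): 0.40,
}
clusters = cluster_sites(sites, scores)
print(clusters)
```

A dense matrix is used here only for readability. A comparison on the scale reported above (84 846 sites) would need a sparse representation and a dedicated MCL implementation.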

PLINDER: The protein-ligand interactions dataset and evaluation resource

Multimodal Protein-Ligand Contrastive Pretraining for Effective and Efficient Drug Discovery

PINDER: The protein interaction dataset and evaluation resource

PLAS-20k: Extended Dataset of Protein-Ligand Affinities from MD Simulations for Machine Learning Applications

PLIP: fully automated protein–ligand interaction profiler

PLIC: protein–ligand interaction clusters

Multi-PLI: interpretable multi‐task deep learning model for unifying protein–ligand interaction datasets

A comprehensive dataset of protein-protein interactions and ligand binding pockets for advancing drug discovery

PLIP 2021: expanding the scope of the protein–ligand interaction profiler to DNA and RNA

Does protein pretrained language model facilitate the prediction of protein–ligand interaction?

ML-PLIC: a web platform for characterizing protein–ligand interactions and developing machine learning-based scoring functions

G-PLIP: Knowledge graph neural network for structure-free protein-ligand bioactivity prediction

DeepRLI: A Multi-objective Framework for Universal Protein–Ligand Interaction Prediction

Benchmark Study Based on 2P2I(DB) to Gain Insights into the Discovery of Small-Molecule PPI Inhibitors

Natural Language Processing Methods for the Study of Protein-Ligand Interactions

Protein-protein interface analysis and hot spots identification for chemical ligand design.

Deep Learning for Protein-Ligand Docking: Are We There Yet?

Robust Protein-Ligand Interaction Modeling Integrating Physical Laws and Geometric Knowledge for Absolute Binding Free Energy Calculation

Leak Proof PDBBind: A Reorganized Dataset of Protein-Ligand Complexes for More Generalizable Binding Affinity Prediction

SMPLIP-Score: predicting ligand binding affinity from simple and interpretable on-the-fly interaction fingerprint pattern descriptors