DrugMGR: a deep bioactive molecule binding method to identify compounds targeting proteins

Xiaokun Li, Qiang Yang, Long Xu, Weihe Dong, Gongning Luo, Wei Wang, Suyu Dong, Kuanquan Wang, Ping Xuan, Xianyu Zhang, Xin Gao
DOI: https://doi.org/10.1093/bioinformatics/btae176
IF: 5.8
2024-03-29
Bioinformatics
Abstract:
Motivation: Understanding the intermolecular interactions of ligand–target pairs is key to guiding the optimization of drug research on cancers, which can greatly mitigate overburden workloads for wet labs. Several improved computational methods have been introduced and exhibit promising performance for these identification tasks, but some pitfalls restrict their practical applications: (i) first, existing methods do not sufficiently consider how multigranular molecule representations influence interaction patterns between proteins and compounds; and (ii) second, existing methods seldom explicitly model the binding sites when an interaction occurs to enable better prediction and interpretation, which may lead to unexpected obstacles to biological researchers.
Results: To address these issues, we here present DrugMGR, a deep multigranular drug representation model capable of predicting binding affinities and regions for each ligand–target pair. We conduct consistent experiments on three benchmark datasets using existing methods and introduce a new specific dataset to better validate the prediction of binding sites. For practical application, target-specific compound identification tasks are also carried out to validate the capability of real-world compound screen. Moreover, the visualization of some practical interaction scenarios provides interpretable insights from the results of the predictions. The proposed DrugMGR achieves excellent overall performance in these datasets, exhibiting its advantages and merits against state-of-the-art methods. Thus, the downstream task of DrugMGR can be fine-tuned for identifying the potential compounds that target proteins for clinical treatment.
Availability and implementation: https://github.com/lixiaokun2020/DrugMGR.
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper addresses two main shortcomings of existing methods for predicting how bioactive molecules bind to protein targets:

1. **Impact of Multi-Granularity Molecular Representation**:
   - Existing methods do not adequately consider how multi-granularity molecular representations shape the interaction patterns between proteins and compounds. The binding affinity of a protein-ligand complex is determined by underlying mechanisms such as the atomic environment, chemogenomic sequences, and interaction effects. However, many previous methods encode these features with separate encoders and do not integrate information across granularities, which makes it difficult to interpret how real interaction patterns affect protein-ligand complexes.

2. **Interpretability of Binding Sites**:
   - Existing methods rarely model binding sites explicitly, which can create unexpected obstacles for biological researchers in practical applications. Although some methods infer binding sites through attention mechanisms, they often fail to associate high-response regions with the corresponding biological characteristics of the target. A general model should highlight binding sites with high confidence and thereby guide researchers in locating them.

To mitigate these issues, the authors propose **DrugMGR**, a deep multi-granularity representation-based method that predicts both the binding affinity and the binding regions of ligands with protein targets. It encodes the ligand at several granularities (atomic environment, chemogenomic sequences, and contextual effects of local substructures) together with advanced features of the protein. According to the abstract, DrugMGR achieves excellent overall performance on three benchmark datasets and a newly introduced binding-site dataset, and it provides interpretable binding-region predictions.
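
As a rough illustration of what "multi-granularity" can mean for a ligand, the sketch below derives three views of a single SMILES string: a token sequence, an atom-level count, and overlapping token n-grams as a crude stand-in for local substructures. This is a toy example under stated assumptions, not DrugMGR's encoders: the function names, the simplified tokenizer regex, and the n-gram approximation are all hypothetical, and the authors' actual implementation is in the repository linked in the abstract.

```python
import re
from collections import Counter

# Assumption: a deliberately simplified SMILES tokenizer (bracket atoms,
# two-letter halogens, then any single character). Real toolkits are stricter.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|.")

# Assumption: tokens treated as atoms (organic subset, aromatic lowercase, brackets).
ATOM_TOKEN = re.compile(r"^(\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops])$")


def sequence_view(smiles):
    """Sequence granularity: the ordered list of SMILES tokens."""
    return SMILES_TOKEN.findall(smiles)


def atom_view(smiles):
    """Atom granularity: how often each atom token occurs."""
    return Counter(t for t in sequence_view(smiles) if ATOM_TOKEN.match(t))


def substructure_view(smiles, n=3):
    """Substructure granularity (crude proxy): overlapping token n-grams."""
    tokens = sequence_view(smiles)
    return ["".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


if __name__ == "__main__":
    aspirin = "CC(=O)Oc1ccccc1C(=O)O"
    print(sequence_view(aspirin))
    print(atom_view(aspirin))
    print(substructure_view(aspirin, n=3))
```

A model in the spirit of the summary above would feed views like these into encoders whose outputs are integrated, rather than treating each view in isolation. How DrugMGR actually performs that integration is not described in this summary.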