PGraphDTA: Improving Drug Target Interaction Prediction using Protein Language Models and Contact Maps

Rakesh Bal, Yijia Xiao, Wei Wang
2024-02-11
Abstract: Developing and discovering new drugs is a complex and resource-intensive endeavor that often involves substantial costs, time investment, and safety concerns. A key aspect of drug discovery involves identifying novel drug-target (DT) interactions. Existing computational methods for predicting DT interactions have primarily focused on binary classification tasks, aiming to determine whether a DT pair interacts or not. However, protein-ligand interactions exhibit a continuum of binding strengths, known as binding affinity, presenting a persistent challenge for accurate prediction. In this study, we investigate various techniques employed in Drug Target Interaction (DTI) prediction and propose novel enhancements to improve their performance. Our approaches include the integration of Protein Language Models (PLMs) and the incorporation of Contact Map information as an inductive bias within current models. Through extensive experimentation, we demonstrate that our proposed approaches outperform the baseline models considered in this study, presenting a compelling case for further development in this direction. We anticipate that the insights gained from this work will significantly narrow the search space for potential drugs targeting specific proteins, thereby accelerating drug discovery. Code and data for PGraphDTA are available at https://github.com/Yijia-Xiao/PgraphDTA/.
Machine Learning, Quantitative Methods
What problem does this paper attempt to address?
The paper addresses the problem of Drug-Target Interaction (DTI) prediction, focusing on three points:

1. **Improving existing methods**: Existing DTI prediction methods mainly treat the task as binary classification, i.e., determining whether a drug-target pair interacts. Protein-ligand interactions, however, span a continuum of binding strengths (binding affinities), and predicting these values accurately remains difficult.
2. **Utilizing Protein Language Models (PLMs)**: The paper proposes integrating PLMs into existing DTI prediction models to improve performance. Experiments show that PLMs can better represent amino acid sequences, thereby improving the prediction of binding affinities.
3. **Introducing contact map information**: The model incorporates contact map information as an inductive bias to improve performance on small datasets. A contact map records which pairs of residues in a protein lie close to each other in space, giving the model a compact view of protein structure that is relevant to how the protein interacts with drugs.

With these improvements, the paper reports that the proposed PGraphDTA model outperforms the baseline model GraphDTA on the DAVIS and KIBA benchmark datasets. The study also finds that incorporating contact map information significantly improves model performance on smaller datasets. These findings offer new insights and techniques for accelerating the drug discovery process.
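To make the contact map idea concrete, the following is a minimal sketch of how a binary residue-residue contact map can be derived from 3D coordinates. It is an illustration only, not the paper's pipeline: the function name `contact_map`, the 8 Å cutoff, the use of one coordinate per residue, and the toy coordinates are all assumptions introduced here, and contact maps used in practice may instead be predicted from sequence.

```python
import math


def contact_map(coords, threshold=8.0):
    """Return a binary residue-residue contact map as a nested list.

    coords: list of (x, y, z) tuples, one representative point per residue
            (for example C-alpha positions; this choice is an assumption).
    threshold: distance cutoff in angstroms. 8.0 is a commonly used value,
               assumed here rather than taken from the paper.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Two residues are "in contact" if they lie within the cutoff.
            if math.dist(coords[i], coords[j]) <= threshold:
                cmap[i][j] = 1
    return cmap


# Toy coordinates for a 4-residue chain (made-up values, not real protein data).
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (20.0, 0.0, 0.0),
]
toy_map = contact_map(toy_coords)

for row in toy_map:
    print(row)
```

The resulting matrix is symmetric with ones on the diagonal. In a graph-based model it can serve as an adjacency matrix over residues, which is one way such structural information can act as an inductive bias.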