Abstract:

Background: Drug-target interactions (DTIs) are central to drug design and play a significant role in many processes underlying complex diseases and cellular events. Given the volume of protein data and the cost of experiments, bioinformatics approaches are used to mine potential interactions for the design of new targeted drugs. Differing data and interaction types, stored in incompatible and heterogeneous formats, make such studies difficult. Analyzing drug-target interactions in a single, unified model remains a significant challenge.

Method: Here, we propose the Large-scale Drug target Screening Convolutional Neural Network (LDS-CNN), a general method for predicting interactions between small-molecule drugs and protein targets. It uses a unified encoding so that data in different formats can be processed in one integrated model for feature abstraction and prediction of potential interaction partners.

Result: On 898,412 interactions involving 1683 small-molecule compounds and 14,350 human proteins, drawn from 8.8 billion records, the proposed method achieved an area under the curve (AUC) of 0.96, an area under the precision-recall curve (AUPRC) of 0.95, and an accuracy of 90.13%. These test-set results indicate strong predictive ability for drug-target interactions. LDS-CNN is effective for large-scale datasets and for datasets that combine data in different formats.

Conclusion: In this study, we propose a DTI prediction method that addresses the unified encoding of large-scale data in multiple formats. It offers a feasible way to efficiently abstract features across different types of drug-related data, thereby reducing experimental cost and time. The proposed method can be used to identify potential drug targets and candidates for the treatment of complex diseases. This work provides a reference for applying deep learning to large-scale, multi-format DTI data and offers suggestions for future research.
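The three metrics reported in the abstract (AUC, AUPRC, accuracy) can all be computed from binary interaction labels and predicted scores. The following is a minimal sketch in plain Python, not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are assumptions made for illustration, and AUPRC is approximated here by average precision.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This pairwise form is O(n^2) and only meant for small illustrations.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean of precision values at each positive hit.

    Used here as a step-wise approximation of the area under the
    precision-recall curve. Assumes scores have no ties.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_pos


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of examples whose thresholded score matches the label."""
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)


# Toy data (hypothetical): 1 = interacting drug-target pair, 0 = non-interacting.
demo_labels = [1, 0, 1, 1, 0, 0]
demo_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

print(f"AUC      = {roc_auc(demo_labels, demo_scores):.4f}")
print(f"AUPRC    = {average_precision(demo_labels, demo_scores):.4f}")
print(f"Accuracy = {accuracy(demo_labels, demo_scores):.4f}")
```

AUC and AUPRC depend only on how the scores rank the pairs, while accuracy also depends on the chosen threshold, which is one reason the three numbers are usually reported together.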

A large dataset curation and benchmark for drug target interaction

A Dataset of Discovering Drug-Target Interaction from Biomedical Literature

Making Sense of Large-Scale Kinase Inhibitor Bioactivity Data Sets: A Comparative and Integrative Analysis

Drug-Target Interaction Prediction by Integrating Chemical, Genomic, Functional and Pharmacological Data

Towards a more inductive world for drug repurposing approaches

SIU: A Million-Scale Structural Small Molecule-Protein Interaction Dataset for Unbiased Bioactivity Prediction

Deep drug-target binding affinity prediction with multiple attention blocks

Research progress on Drug-Target Interactions in the last five years

DataDTA: a multi-feature and dual-interaction aggregation framework for drug–target binding affinity prediction

LDS-CNN: a deep learning framework for drug-target interactions prediction based on large-scale drug screening

A deep learning method for drug-target affinity prediction based on sequence interaction information mining

Attention-based approach to predict drug-target interactions across seven target superfamilies

Benchmark on Drug Target Interaction Modeling from a Structure Perspective

MolData, a molecular benchmark for disease and target based machine learning

Benchmarking compound activity prediction for real-world drug discovery applications

CoaDTI: multi-modal co-attention based framework for drug-target interaction annotation

Prediction of drug-target interactions for drug repositioning only based on genomic expression similarity

A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information

MDTips: A Multimodal-data based Drug-Target interaction prediction system fusing knowledge, gene expression profile and structural data

DrugOOD: Out-of-Distribution Dataset Curator and Benchmark for AI-Aided Drug Discovery - a Focus on Affinity Prediction Problems with Noise Annotations

A Biological Feature and Heterogeneous Network Representation Learning-Based Framework for Drug–Target Interaction Prediction