Predicting Drug–Target Interactions Based on the Ensemble Models of Multiple Feature Pairs

Cheng Wang, Jun Zhang, Peng Chen, Bing Wang

DOI: https://doi.org/10.3390/ijms22126598

IF: 5.6

2021-06-20

International Journal of Molecular Sciences

Abstract: Background: The prediction of drug–target interactions (DTIs) is of great significance in drug development. Traditional experimental methods are time-consuming and expensive. Machine learning can reduce the cost of prediction, but it is limited by imbalanced datasets and by the difficulty of selecting essential features. Methods: A prediction method based on the Ensemble model of Multiple Feature Pairs (Ensemble-MFP) is introduced. First, three negative sets are generated according to the Euclidean distance of three feature pairs. Then, the negative samples of the validation set and the test set are randomly selected from the union of the three negative sets within the respective set. At the same time, the weighted ensemble model is optimized and applied to the test set. Results: In the prediction of new drugs, the area under the receiver operating characteristic curve (AUC) exceeded 94.0% on three of the four sub-datasets of the gold standard datasets. The effectiveness of the proposed method is also shown by comparison with state-of-the-art methods and by demonstration of predicted drug–target pairs. Conclusion: Ensemble-MFP can weight the existing feature pairs and performs well in the general prediction of new drugs.
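The negative-set step in the abstract can be illustrated with a minimal sketch. The abstract only says that the negative sets are built "according to the Euclidean distance" of each feature pair; the selection rule used here (keep the unlabeled pairs whose nearest known positive is farthest away) is an assumption, as are the function names, the random toy features, and the feature dimensions.

```python
import numpy as np

def farthest_from_positives(pos_feats, cand_feats, n_neg):
    """Indices of the n_neg candidate pairs farthest from every known positive.

    Assumed rule (not confirmed by the abstract): rank each unlabeled
    candidate by the Euclidean distance to its nearest positive pair.
    """
    # Pairwise differences, shape (n_candidates, n_positives, n_features)
    diff = cand_feats[:, None, :] - pos_feats[None, :, :]
    # Distance from each candidate to its nearest positive
    nearest = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
    # Largest nearest-positive distance first
    return np.argsort(-nearest, kind="stable")[:n_neg]

def sample_from_union(neg_sets, n_sample, rng):
    """Randomly draw evaluation negatives from the union of the negative sets."""
    union = np.unique(np.concatenate(neg_sets))
    return rng.choice(union, size=min(n_sample, union.size), replace=False)

rng = np.random.default_rng(0)
n_pos, n_cand = 20, 200
# Hypothetical feature pairs: each describes the same drug-target pairs
# with a different (drug descriptor, target descriptor) vector.
neg_sets = []
for dim in (16, 32, 8):
    pos = rng.normal(size=(n_pos, dim))    # toy features of known interactions
    cand = rng.normal(size=(n_cand, dim))  # toy features of unlabeled pairs
    neg_sets.append(farthest_from_positives(pos, cand, n_neg=n_pos))

# Negatives for a validation or test set come from the union of the three sets.
eval_negatives = sample_from_union(neg_sets, n_sample=n_pos, rng=rng)
print(len(eval_negatives))
```

Drawing evaluation negatives from the union, rather than from a single set, keeps the validation and test negatives from favoring any one feature pair.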

biochemistry & molecular biology; chemistry, multidisciplinary

What problem does this paper attempt to address?

This paper addresses the prediction of drug–target interactions (DTIs) in drug development. Traditional experimental methods are time-consuming and costly, while machine-learning methods can reduce the prediction cost but are limited by imbalanced datasets and the difficulty of key feature selection. Specifically, the paper aims to improve prediction for new drugs through an ensemble model based on multiple feature pairs (Ensemble-MFP), while also addressing negative sample generation and feature-pair selection.

The methods proposed in the paper mainly include the following aspects:

1. **Negative sample generation**: Generate three negative sample sets according to the Euclidean distances of three feature pairs, and randomly select negative samples for the validation set and the test set from them to improve the reliability of negative samples.
2. **Ensemble model construction**: Train three sub-models using three different feature pairs, and combine these sub-models by optimizing the weights to form the final ensemble model.
3. **Dataset partitioning**: Use 5-fold cross-validation to divide drugs proportionally into training, validation, and test sets, so that the model's ability to generalize to new drugs is evaluated.

The main contributions of the paper are:

- Proposing an ensemble model method based on multiple feature pairs, which addresses the problem of imbalanced datasets.
- Improving prediction performance by optimizing the model weights, with particularly good results in the prediction of new drugs.
- Outperforming existing methods on multiple benchmark datasets, especially the GPCR and ion channel datasets.

In general, this paper provides a new and effective solution for the prediction of drug–target interactions, which helps to accelerate the drug development process.
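The weighted combination of sub-models described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the summary does not say how the weights are optimized, so a grid search over convex weights that maximizes validation AUC stands in for that step. The helper names `auc` and `fit_weights`, the step size, and the toy scores are all hypothetical.

```python
import itertools
import numpy as np

def auc(y_true, scores):
    """Rank-based AUC (Mann-Whitney U); tied scores share their average rank."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="mergesort")
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    for value in np.unique(scores):
        tied = scores == value
        ranks[tied] = ranks[tied].mean()
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def fit_weights(val_scores, y_val, step=0.1):
    """Grid-search non-negative weights summing to 1 that maximize validation AUC.

    val_scores has one column per sub-model (one sub-model per feature pair).
    """
    grid = np.round(np.arange(0.0, 1.0 + step / 2, step), 10)
    best_w, best_auc = None, -1.0
    for w in itertools.product(grid, repeat=val_scores.shape[1]):
        if abs(sum(w) - 1.0) > 1e-9:
            continue  # keep only weight vectors on the simplex
        current = auc(y_val, val_scores @ np.array(w))
        if current > best_auc:
            best_w, best_auc = np.array(w), current
    return best_w, best_auc

# Toy validation scores from three hypothetical sub-models of varying quality.
rng = np.random.default_rng(0)
y_val = np.array([0] * 50 + [1] * 50)
val_scores = np.column_stack(
    [y_val * strength + rng.normal(size=100) for strength in (1.5, 0.8, 0.2)]
)
weights, val_auc = fit_weights(val_scores, y_val)
print(weights, round(val_auc, 3))

# The fitted weights are then applied unchanged to test-set sub-model scores:
# test_pred = test_scores @ weights
```

Fitting the weights on validation data only and reusing them unchanged on the test set keeps the test estimate free of tuning leakage. For the new-drug setting, the validation and test folds would contain drugs that are absent from the training fold.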

