Abstract

Background: Identifying protein-protein interactions (PPIs) is essential for elucidating protein functions and understanding the molecular mechanisms inside the cell. However, the experimental methods for detecting PPIs are both time-consuming and expensive. Computational prediction of protein interactions is therefore becoming increasingly popular: it provides an inexpensive way to predict the most likely set of interactions at the entire proteome scale and can complement experimental approaches. Although much progress has been made in this direction, the problem is far from solved, and new approaches are still required to overcome the limitations of current prediction models.

Results: In this work, a sequence-based approach is developed by combining a novel Multi-scale Continuous and Discontinuous (MCD) feature representation with a Support Vector Machine (SVM). The MCD representation takes into account the interactions between sequentially distant but spatially close amino acid residues, so it can capture multiple overlapping continuous and discontinuous binding patterns within a protein sequence. A feature selection method, minimum redundancy maximum relevance (mRMR), is employed to construct an optimized and more discriminative feature set by excluding redundant features. Finally, a prediction model is trained and tested with the SVM algorithm to predict the interaction probability of protein pairs.

Conclusions: When applied to the yeast PPI data set, the proposed approach achieved 91.36% prediction accuracy with 91.94% precision at a sensitivity of 90.67%. Extensive experiments were conducted to compare our method with existing sequence-based methods. Experimental results show that our predictor performs better than several other state-of-the-art predictors, whose average prediction accuracy is 84.91%, sensitivity is 83.24%, and precision is 86.12%. These results show that the proposed approach is promising for predicting PPIs and can be a useful supplementary tool for future proteomics studies. The source code and the datasets are freely available at http://csse.szu.edu.cn/staff/youzh/MCDPPI.zip for academic use.
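
The accuracy, precision, and sensitivity figures quoted in the abstract are standard confusion-matrix ratios. The following is a minimal sketch of how such figures are computed. The `ConfusionCounts` class, the function names, and the counts (90, 8, 92, 10) are hypothetical and chosen for illustration only; they are not taken from the yeast data set or from the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical outcome counts for a binary PPI classifier."""
    tp: int  # interacting pairs predicted as interacting
    fp: int  # non-interacting pairs predicted as interacting
    tn: int  # non-interacting pairs predicted as non-interacting
    fn: int  # interacting pairs predicted as non-interacting


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all protein pairs that were classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def precision(c: ConfusionCounts) -> float:
    """Fraction of predicted interactions that are true interactions."""
    return c.tp / (c.tp + c.fp)


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true interactions that were recovered (recall)."""
    return c.tp / (c.tp + c.fn)


# Illustrative counts only (assumption): 100 interacting, 100 non-interacting pairs.
counts = ConfusionCounts(tp=90, fp=8, tn=92, fn=10)

print(f"accuracy    = {accuracy(counts):.4f}")
print(f"precision   = {precision(counts):.4f}")
print(f"sensitivity = {sensitivity(counts):.4f}")
```

Reporting precision "at" a given sensitivity, as the abstract does, reflects that the two trade off against each other as the decision threshold on the predicted interaction probability moves.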

Prediction of Protein–Protein Interactions with Clustered Amino Acids and Weighted Sparse Representation

Improved Protein-Protein Interactions Prediction Via Weighted Sparse Representation Model Combining Continuous Wavelet Descriptor and PseAA Composition

Using Weighted Sparse Representation Model Combined with Discrete Cosine Transformation to Predict Protein-Protein Interactions from Protein Sequence.

Prediction of Protein-Protein Interactions from Protein Sequences by Combining MatPCA Feature Extraction Algorithms and Weighted Sparse Representation Models

Prediction of Protein-Protein Interactions from Amino Acid Sequences Based on Continuous and Discrete Wavelet Transform Features.

Sequence-based Prediction of Protein-Protein Interactions Using Weighted Sparse Representation Model Combined with Global Encoding

Construction of Reliable Protein-Protein Interaction Networks Using Weighted Sparse Representation Based Classifier with Pseudo Substitution Matrix Representation Features

Using Chou's amphiphilic Pseudo-Amino Acid Composition and Extreme Learning Machine for prediction of Protein-protein interactions

An Improved Sequence-Based Prediction Protocol for Protein-Protein Interactions Using Amino Acids Substitution Matrix and Rotation Forest Ensemble Classifiers.

Predicting Protein Interactions Using a Deep Learning Method-Stacked Sparse Autoencoder Combined with a Probabilistic Classification Vector Machine.

A SVM-based system for predicting protein-protein interactions using a novel representation of protein sequences

A novel method to predict protein-protein interactions based on the information of protein sequence

Improved Prediction Of Protein-Protein Interactions Using Descriptors Derived From PSSM Via Gray Level Co-Occurrence Matrix

Advancing the prediction accuracy of protein-protein interactions by utilizing evolutionary information from position-specific scoring matrix and ensemble classifier

Predicting Protein-Protein Interactions from Primary Protein Sequences Using a Novel Multi-Scale Local Feature Representation Scheme and the Random Forest

Prediction of Protein-Protein Interactions from Amino Acid Sequences Using A Novel Multi-Scale Continuous and Discontinuous Feature Set

Predicting Protein-Protein Interactions from Protein Sequences by a Stacked Sparse Autoencoder Deep Neural Network.

Ens-PPI: A Novel Ensemble Classifier for Predicting the Interactions of Proteins Using Autocovariance Transformation from PSSM

FCTP-WSRC: Protein-Protein Interactions Prediction via Weighted Sparse Representation Based Classification

An Efficient Ensemble Learning Approach for Predicting Protein-Protein Interactions by Integrating Protein Primary Sequence and Evolutionary Information.

Protein-Protein Interactions Prediction Using A Novel Local Conjoint Triad Descriptor Of Amino Acid Sequences