Prediction of protein–protein interactions based on elastic net and deep forest
Bin Yu, Cheng Chen, Xiaolin Wang, Zhaomin Yu, Anjun Ma, Bingqiang Liu
DOI: https://doi.org/10.1016/j.eswa.2021.114876
IF: 8.5
2021-08-01
Expert Systems with Applications
Abstract:<p>Predicting protein-protein interactions (PPIs) helps to reveal the molecular roots of disease. However, wet-lab experiments for identifying PPIs are limited in scope and costly. Machine-learning-based frameworks offer a promising alternative: they can identify PPIs automatically and provide new ideas for drug research and development. We present a novel deep-forest-based method for PPI prediction. First, pseudo amino acid composition (PAAC), autocorrelation descriptor (Auto), multivariate mutual information (MMI), composition-transition-distribution (CTD), amino acid composition position-specific scoring matrix (AAC-PSSM), and dipeptide composition PSSM (DPC-PSSM) are used to extract features that characterize PPIs. Second, elastic net is used to optimize the initial feature vectors and boost predictive performance. Finally, we ensemble XGBoost, random forest, and extremely randomized trees in a cascade architecture to construct a deep forest model for PPI prediction (GcForest-PPI). Benchmark experiments show that the proposed approach outperforms other state-of-the-art predictors on <em>Saccharomyces cerevisiae</em> and <em>Helicobacter pylori</em>. We also apply GcForest-PPI to independent test sets, a CD9-core network, a crossover network, and a cancer-specific network. The evaluation shows that GcForest-PPI can improve prediction accuracy, complement experiments, and support drug discovery.</p>
computer science, artificial intelligence; engineering, electrical & electronic; operations research & management science
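
The abstract describes feature extraction, elastic net feature optimization, and a cascade of tree ensembles. The last two stages can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several assumptions: the data are synthetic (`make_classification` stands in for the PAAC/Auto/MMI/CTD/PSSM descriptors), feature selection keeps features with non-zero elastic net coefficients, XGBoost is left out so that only scikit-learn is needed, and the helper names (`select_features`, `fit_cascade`, `predict_cascade`) and all hyperparameters are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import cross_val_predict, train_test_split


def select_features(X_train, y_train, alpha=0.01, l1_ratio=0.5):
    """Fit an elastic net and keep features whose |coefficient| exceeds 1e-5.

    alpha and l1_ratio are illustrative values, not taken from the paper.
    """
    model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=10000)
    selector = SelectFromModel(model, threshold=1e-5)
    selector.fit(X_train, y_train)
    return selector


def fit_cascade(X, y, n_levels=2, seed=0):
    """Train a small cascade: each level sees the selected features plus
    the out-of-fold class probabilities produced by the previous level."""
    levels = []
    augmented = X
    for level in range(n_levels):
        forests = [
            RandomForestClassifier(n_estimators=50, random_state=seed + level),
            ExtraTreesClassifier(n_estimators=50, random_state=seed + level),
        ]
        # Out-of-fold probabilities avoid leaking training labels downstream.
        oof = [
            cross_val_predict(f, augmented, y, cv=3, method="predict_proba")
            for f in forests
        ]
        for f in forests:
            f.fit(augmented, y)
        levels.append(forests)
        augmented = np.hstack([X] + oof)
    return levels


def predict_cascade(levels, X):
    """Pass X through every level and average the last level's probabilities."""
    augmented = X
    for forests in levels:
        probas = [f.predict_proba(augmented) for f in forests]
        augmented = np.hstack([X] + probas)
    return np.mean(probas, axis=0)


# Synthetic stand-in for concatenated protein-pair descriptors.
X, y = make_classification(
    n_samples=400, n_features=40, n_informative=8,
    n_clusters_per_class=1, class_sep=1.5, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

selector = select_features(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

levels = fit_cascade(X_train_sel, y_train)
proba = predict_cascade(levels, X_test_sel)
accuracy = float(np.mean(proba.argmax(axis=1) == y_test))
print(f"kept {X_train_sel.shape[1]} of {X.shape[1]} features, accuracy = {accuracy:.3f}")
```

Passing out-of-fold probabilities between levels is one common way to build such a cascade. A fuller version would add a gradient-boosting learner to each level and stop adding levels once validation accuracy no longer improves.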