Prediction of Protein–Protein Interactions Based on Integrating Deep Learning and Feature Fusion

Hoai-Nhan Tran, Phuc-Xuan-Quynh Nguyen, Fei Guo, Jianxin Wang
DOI: https://doi.org/10.3390/ijms25115820
IF: 5.6
2024-05-28
International Journal of Molecular Sciences
Abstract: Understanding protein–protein interactions (PPIs) helps to identify protein functions and develop other important applications such as drug preparation and protein–disease relationship identification. Deep-learning-based approaches are being intensely researched for PPI determination to reduce the cost and time of previous testing methods. In this work, we integrate deep learning with feature fusion, harnessing the strengths of both handcrafted features and protein sequence embedding. The accuracies of the proposed model using five-fold cross-validation on the Yeast core and Human datasets are 96.34% and 99.30%, respectively. In the task of predicting interactions in important PPI networks, our model correctly predicted all interactions in one-core, Wnt-related, and cancer-specific networks. The experimental results on cross-species datasets, including Caenorhabditis elegans, Helicobacter pylori, Homo sapiens, Mus musculus, and Escherichia coli, also show that our feature fusion method helps increase the generalization capability of the PPI prediction model.
biochemistry & molecular biology; chemistry, multidisciplinary
What problem does this paper attempt to address?
The main objective of this paper is to propose a new protein–protein interaction (PPI) prediction model that combines deep learning techniques and feature fusion methods to improve prediction accuracy. Specifically, the paper aims to address the following issues:

1. **Improving PPI prediction accuracy**: Existing PPI prediction methods have limitations in accuracy and efficiency, and experimental biological methods in particular are costly and time-consuming. The researchers therefore aim to develop a new computational method that enhances the accuracy of PPI predictions.
2. **Reducing the effort of manual feature engineering**: Traditional machine-learning-based methods require a significant amount of manual feature extraction, which is time-consuming and labor-intensive and may not yield the optimal features. Leveraging deep learning to learn feature representations automatically alleviates this burden.
3. **Enhancing the model's generalization ability**: To ensure that the model performs well on datasets from different species, the researchers aim to develop a model that can extract useful information from various data sources.
4. **Optimizing feature fusion techniques**: By combining multiple types of features (such as handcrafted features and protein sequence embeddings), the researchers hope to construct a more robust feature representation and thereby improve PPI prediction performance.

In summary, the main goal of this paper is to develop a new model named DF-PPI, which integrates deep learning and feature fusion techniques to predict protein–protein interactions more accurately and efficiently, with strong generalization capabilities.
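
The idea of fusing handcrafted features with sequence embeddings can be illustrated with a minimal sketch. It is not the DF-PPI implementation. The summary above does not say which handcrafted descriptors or which embedding model the paper uses, so everything here is an assumption made for illustration: amino acid composition stands in for the handcrafted features, `toy_embedding` is a hypothetical placeholder for a learned sequence embedding, and fusion is done by plain concatenation.

```python
from collections import Counter
import hashlib

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(seq: str) -> list[float]:
    """Handcrafted feature (assumed for illustration): the fraction of each
    standard amino acid in the sequence, giving a 20-dimensional vector."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS) or 1  # avoid division by zero
    return [counts[a] / total for a in AMINO_ACIDS]


def toy_embedding(seq: str, dim: int = 8) -> list[float]:
    """Hypothetical stand-in for a learned sequence embedding.

    Each residue is mapped to a fixed vector derived from a hash of its
    letter, and the sequence embedding is the mean over residues. A real
    system would call a trained embedding model here instead.
    """
    if not seq:
        return [0.0] * dim
    sums = [0.0] * dim
    for residue in seq.upper():
        digest = hashlib.sha256(residue.encode()).digest()
        for i in range(dim):
            sums[i] += digest[i] / 255.0
    return [s / len(seq) for s in sums]


def fuse(seq: str) -> list[float]:
    """Feature fusion by concatenation: handcrafted part, then embedding part."""
    return composition_features(seq) + toy_embedding(seq)


def pair_features(seq_a: str, seq_b: str) -> list[float]:
    """Represent a candidate protein pair by concatenating both fused vectors.
    This vector would be the input to a downstream interaction classifier."""
    return fuse(seq_a) + fuse(seq_b)
```

With the default settings, each protein yields a 28-dimensional fused vector (20 composition values plus 8 embedding values), and a protein pair yields a 56-dimensional vector. Concatenation is the simplest fusion strategy; the paper may combine the two feature types differently.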