A predictive approach for host-pathogen interactions using deep learning and protein sequences

Taha Shakibania, Masoud Arabfard, Ali Najafi
DOI: https://doi.org/10.1007/s13337-024-00882-x
VirusDisease
Abstract: Research on host-pathogen interactions (HPIs) has evolved rapidly over the past decades. As more pathogens are discovered, finding cures and preventing the infections they cause becomes more challenging. Many experimental techniques have been proposed to identify these interactions, but most are costly and time-consuming. Computational methods have proven efficient in overcoming such limitations. In this study, we propose using deep learning methods to predict HPIs from protein sequences. We use the monoMonoKGap (mMKGap) algorithm with K = 2 to extract features from the sequences, and the Negatome Database to generate negative interactions. The proposed method was evaluated on three separate balanced human-pathogen datasets with 10-fold cross-validation, yielding accuracies of 99.65%, 99.52%, and 99.66% (mean accuracy of 99.61%). To further evaluate the performance of the deep network, we compared it with other classification methods: Random Forest (RF), an ensemble of multiple decision trees; Support Vector Machine (SVM); and Convolutional Neural Network (CNN). We also tested the Dipeptide Composition algorithm as an alternative feature extraction method for comparison with mMKGap. The experimental results show that the proposed method is accurate, robust, and practical, and could be used as a reliable framework in HPI research.
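The two feature extraction schemes named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that a gapped pair feature is the frequency of two residues separated by exactly k intervening positions, that features for k = 1 through K are concatenated, and that counts are normalized by the number of valid pairs. The exact gap indexing and normalization used by mMKGap may differ. The function names and the choice to concatenate host and pathogen vectors are hypothetical. Dipeptide Composition is shown as the adjacent-pair case (no gap).

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def gapped_pair_frequencies(sequence, gap):
    """Frequency of each residue pair with exactly `gap` residues in between.

    Assumption: counts are normalized by the number of valid pairs found;
    pairs containing non-standard residues are skipped.
    """
    counts = dict.fromkeys(PAIRS, 0)
    step = gap + 1
    total = 0
    for i in range(len(sequence) - step):
        pair = sequence[i] + sequence[i + step]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in PAIRS]


def gapped_features(sequence, max_gap=2):
    """Concatenate gapped-pair frequencies for gaps 1..max_gap (assumed layout)."""
    features = []
    for gap in range(1, max_gap + 1):
        features.extend(gapped_pair_frequencies(sequence, gap))
    return features


def dipeptide_composition(sequence):
    """Frequencies of the 400 adjacent residue pairs (gap of zero)."""
    return gapped_pair_frequencies(sequence, 0)


def pair_features(host_seq, pathogen_seq, max_gap=2):
    """Hypothetical encoding of one host-pathogen pair: concatenated vectors."""
    return gapped_features(host_seq, max_gap) + gapped_features(pathogen_seq, max_gap)
```

Under these assumptions, K = 2 gives 800 features per protein (400 per gap value), compared with 400 for Dipeptide Composition.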