Predicting the subcellular location of prokaryotic proteins with DeepLocPro

Jaime Moreno, Henrik Nielsen, Ole Winther, Felix Teufel
DOI: https://doi.org/10.1101/2024.01.04.574157
2024-01-04
Abstract: Protein subcellular location prediction is a widely explored task in bioinformatics because of its importance in proteomics research. We propose DeepLocPro, an extension to the popular method DeepLoc, tailored specifically to archaeal and bacterial organisms. DeepLocPro is a multiclass subcellular location prediction tool for prokaryotic proteins, trained on experimentally verified data curated from UniProt and PSORTdb. DeepLocPro compares favorably to the PSORTb 3.0 ensemble method, surpassing its performance across multiple metrics on our benchmark experiment. The DeepLocPro prediction tool is available online.
Bioinformatics
What problem does this paper attempt to address?
This paper presents DeepLocPro, a deep learning method for predicting the subcellular localization of prokaryotic (archaeal and bacterial) proteins. It assigns each protein to one of six locations: cell wall and surface, extracellular space, cytoplasm, cytoplasmic membrane, outer membrane, or periplasm. The training data are experimentally validated annotations collected from UniProt and PSORTdb, and features are extracted from the protein sequences with ESM-2, a pre-trained protein language model (pLM). The model was evaluated with 5-fold cross-validation and compared against PSORTb 3.0, an existing prediction tool for prokaryotic subcellular localization. DeepLocPro achieved higher accuracy, macro-averaged F1 score, and Matthews correlation coefficient than PSORTb 3.0, with the advantage most pronounced on bacterial proteins. Some categories, such as archaeal cell wall proteins, remain difficult because little training data is available for them. DeepLocPro is available as an online tool.
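To make the three metrics named above concrete, here is a minimal pure-Python sketch of accuracy, macro-averaged F1, and the multiclass Matthews correlation coefficient. It is an illustration under assumptions, not the authors' evaluation code: the function names, the label strings, and the toy predictions are all hypothetical, and the multiclass MCC shown is the standard confusion-matrix generalization, which may differ in detail from what the paper used.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of proteins whose predicted location matches the annotation."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare locations count as much as common ones."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def multiclass_mcc(y_true, y_pred):
    """Multiclass MCC computed from class counts (generalizes the binary formula)."""
    s = len(y_true)                                    # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))    # correctly predicted samples
    t_counts = Counter(y_true)                         # how often each class truly occurs
    p_counts = Counter(y_pred)                         # how often each class is predicted
    cov = c * s - sum(t_counts[k] * p_counts[k] for k in t_counts)
    denom = sqrt(
        (s * s - sum(v * v for v in p_counts.values()))
        * (s * s - sum(v * v for v in t_counts.values()))
    )
    return cov / denom if denom else 0.0


# Toy example with made-up labels (not data from the paper).
y_true = ["Cytoplasmic", "Cytoplasmic", "Periplasmic", "Outer membrane"]
y_pred = ["Cytoplasmic", "Periplasmic", "Periplasmic", "Outer membrane"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
print(f"MCC      = {multiclass_mcc(y_true, y_pred):.3f}")
```

Macro averaging is a sensible companion to accuracy for this task because the six locations are unevenly represented: a model can score high accuracy by favoring the most common class, whereas macro F1 and MCC penalize poor performance on sparse categories.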