DeepLoc 2.1: multi-label membrane protein type prediction using protein language models

Marius Thrane Ødum, Felix Teufel, Vineet Thumuluri, José Juan Almagro Armenteros, Alexander Rosenberg Johansen, Ole Winther, Henrik Nielsen
DOI: https://doi.org/10.1093/nar/gkae237
2024-07-05
Abstract: DeepLoc 2.0 is a popular web server for the prediction of protein subcellular localization and sorting signals. Here, we introduce DeepLoc 2.1, which additionally classifies the input proteins into the membrane protein types Transmembrane, Peripheral, Lipid-anchored and Soluble. Leveraging pre-trained transformer-based protein language models, the server utilizes a three-stage architecture for sequence-based, multi-label predictions. Comparative evaluations with other established tools on a test set of 4933 eukaryotic protein sequences, constructed following stringent homology partitioning, demonstrate state-of-the-art performance. Notably, DeepLoc 2.1 outperforms existing models, with the larger ProtT5 model exhibiting a marginal advantage over the ESM-1B model. The web server is available at https://services.healthtech.dtu.dk/services/DeepLoc-2.1.