CW-PRED: Prediction of C-terminal surface anchoring sorting signals in bacteria and Archaea

Aikaterini G. Chatziargyri, Evangelia A. Stasi, Konstantinos I. Tsirigos, Zoi I. Litou, Vassiliki A. Iconomidou, Pantelis G. Bagos
DOI: https://doi.org/10.1142/s0219720024500215
2024-09-03
Journal of Bioinformatics and Computational Biology
Abstract: Sorting signals are crucial for anchoring proteins to the cell surface in archaea and bacteria. These proteins often feature distinct motifs at their C-terminus, which are cleaved by sortase or sortase-like enzymes. Gram-positive bacteria exhibit the LPXTGX consensus motif, cleaved by sortases, while Gram-negative bacteria employ exosortases that recognize motifs such as PEP. Archaea use exosortase homologs, known as archaeosortases, for signal anchoring. Traditionally, such C-terminal sorting signals were identified with profile Hidden Markov Models (pHMMs). The Cell-Wall PREDiction (CW-PRED) method introduced, for the first time, a custom-made class HMM for proteins in Gram-positive bacteria that contain a cell wall sorting signal, which begins with an LPXTG motif followed by a hydrophobic domain and a tail of positively charged residues. Here we present a new and updated version of CW-PRED for predicting C-terminal sorting signals in Archaea, Gram-positive, and Gram-negative bacteria. We used a large training set and several model enhancements that improve motif identification in order to achieve better discrimination between C-terminal signals and other proteins. Cross-validation demonstrates CW-PRED's superiority in sensitivity and specificity compared to other methods. Application of the method to reference proteomes reveals a large number of potential surface proteins not previously identified. The method is available for academic use at http://195.251.108.230/apps.compgen.org/CW-PRED/ and as standalone software.
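To make the tripartite signal described in the abstract concrete (an LPXTG motif, then a hydrophobic domain, then a positively charged tail), the following is a minimal heuristic sketch. It is not the CW-PRED class HMM and does not reproduce the authors' method. The function name, the hydrophobic residue set, and every threshold (`c_term_window`, `hyd_len`, `hyd_frac`, `min_positive`) are illustrative assumptions, and the test sequences are synthetic.

```python
import re

# LPXTG, where X is any residue. Only the Gram-positive motif is sketched here.
MOTIF = re.compile(r"LP.TG")
# Assumed residue classes for illustration; not taken from the paper.
HYDROPHOBIC = set("AVLIFMWCG")
POSITIVE = set("KR")


def looks_like_lpxtg_signal(seq, c_term_window=60, hyd_len=12,
                            hyd_frac=0.75, min_positive=2):
    """Crude check for motif + hydrophobic stretch + charged tail near the C-terminus.

    All thresholds are hypothetical defaults chosen for the example.
    """
    seq = seq.upper()
    # Only consider motifs that start within the last `c_term_window` residues.
    region_start = max(0, len(seq) - c_term_window)
    for match in MOTIF.finditer(seq, region_start):
        rest = seq[match.end():]
        # Slide a fixed-length window over the residues after the motif.
        for i in range(len(rest) - hyd_len + 1):
            window = rest[i:i + hyd_len]
            frac = sum(r in HYDROPHOBIC for r in window) / hyd_len
            if frac < hyd_frac:
                continue
            # Residues after the hydrophobic window form the candidate tail.
            tail = rest[i + hyd_len:]
            if sum(r in POSITIVE for r in tail) >= min_positive:
                return True
    return False


# Synthetic sequences, not real proteins.
candidate = "M" + "S" * 40 + "LPETG" + "AAVLLGIAALLAGVV" + "RRKKE"
no_motif = "M" + "S" * 80
print(looks_like_lpxtg_signal(candidate), looks_like_lpxtg_signal(no_motif))
```

A fixed pattern with hand-picked cutoffs like this cannot model variation in motif composition or segment length, which is the kind of limitation a probabilistic model such as an HMM is designed to address.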
mathematical & computational biology