Exploring proteome-wide occurrence of clusters of charged residues in eukaryotes

Sabrine Belmabrouk, Najla Kharrat, Riadh Benmarzoug, Ahmed Rebai
DOI: https://doi.org/10.1002/prot.24823
Abstract: Clusters of charged residues are one of the key features of protein primary structure, since they have been associated with important protein functions. Here, we present a proteome-wide scan for the occurrence of Charge Clusters in Protein sequences using a new search tool (FCCP) that relies on a score-based methodology. FCCP was run to search for charge clusters in seven eukaryotic proteomes: Arabidopsis thaliana, Caenorhabditis elegans, Danio rerio, Drosophila melanogaster, Homo sapiens, Mus musculus, and Saccharomyces cerevisiae. We found that negative charge clusters (NCCs) are three to four times more frequent than positive charge clusters (PCCs). The Drosophila proteome is on average the most charged, whereas the human proteome is the least charged. Only 3 to 8% of the studied protein sequences have NCCs, while 1.6 to 3% have PCCs and only 0.07 to 0.6% have both types of clusters. NCCs are localized predominantly in the N-terminal and C-terminal domains, while PCCs tend to be localized within the functional domains of the protein sequences. Furthermore, the gene ontology classification revealed that the protein sequences with NCCs and PCCs are mainly binding proteins.
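The abstract does not spell out how FCCP scores a sequence, so the following is only a rough illustration of what a charge-cluster scan can look like, not the authors' method. It assumes the simplest possible score: the number of charged residues in a fixed-length sliding window. The function name `find_charge_clusters`, the window length of 20, the threshold of 10, and the choice of K/R as positive and D/E as negative residues (histidine excluded) are all hypothetical choices made for this sketch.

```python
# Illustrative sketch only: a sliding-window count stands in for FCCP's
# actual score, which is not described in the abstract.
POSITIVE = set("KR")  # assumed positive residues (histidine left out)
NEGATIVE = set("DE")  # assumed negative residues


def find_charge_clusters(seq, charged, window=20, min_count=10):
    """Return half-open (start, end) spans rich in `charged` residues.

    A window qualifies when it holds at least `min_count` charged residues.
    Overlapping qualifying windows are merged, then each merged span is
    trimmed so that it starts and ends on a charged residue.
    """
    if min_count < 1:
        raise ValueError("min_count must be at least 1")
    seq = seq.upper()
    n = len(seq)
    if n < window:
        return []

    flags = [1 if aa in charged else 0 for aa in seq]
    count = sum(flags[:window])  # charged residues in the first window
    spans = []
    for start in range(n - window + 1):
        if start > 0:
            # Slide by one: add the entering residue, drop the leaving one.
            count += flags[start + window - 1] - flags[start - 1]
        if count >= min_count:
            end = start + window
            if spans and start <= spans[-1][1]:
                spans[-1] = (spans[-1][0], end)  # merge with previous span
            else:
                spans.append((start, end))

    trimmed = []
    for s, e in spans:
        while not flags[s]:
            s += 1
        while not flags[e - 1]:
            e -= 1
        trimmed.append((s, e))
    return trimmed


if __name__ == "__main__":
    toy = "A" * 30 + "D" * 12 + "A" * 30  # synthetic sequence, not real data
    print("NCC spans:", find_charge_clusters(toy, NEGATIVE))
    print("PCC spans:", find_charge_clusters(toy, POSITIVE))
```

Running the same function once with `NEGATIVE` and once with `POSITIVE` over every sequence in a proteome would give per-protein NCC and PCC calls of the kind the abstract summarizes, although the reported percentages depend on FCCP's own scoring and thresholds rather than on the values assumed here.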