Zipf's Law Probably Existing in Protein Sequences

Yu-jian LI, Chuang-bai XIAO
DOI: https://doi.org/10.3969/j.issn.0254-0037.2005.04.007
2005-01-01
Abstract: To analyze whether Zipf's law from linguistics holds in protein sequences, this paper uses 1.7357 × 10^4 protein sequences labeled with secondary structures, selected from the DSSP database. Segments of successive amino acid residues that share the same secondary-structure code are defined as words. The results show that the distribution of word occurrence frequencies follows Zipf's law with an exponent of 0.981.
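The word definition and the exponent estimate described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `split_into_words` and `zipf_exponent` are hypothetical, the input is assumed to be a residue string paired with an equal-length string of per-residue secondary-structure codes, and the exponent is assumed to be estimated by an ordinary least-squares fit of log frequency against log rank, which the abstract does not specify.

```python
from collections import Counter
from itertools import groupby
import math


def split_into_words(residues, structure_codes):
    """Split a residue string into maximal runs that share one structure code.

    Hypothetical helper: assumes one secondary-structure code per residue.
    """
    if len(residues) != len(structure_codes):
        raise ValueError("residues and structure_codes must have equal length")
    words = []
    pos = 0
    # groupby yields one group per maximal run of identical consecutive codes.
    for _, run in groupby(structure_codes):
        run_len = sum(1 for _ in run)
        words.append(residues[pos:pos + run_len])
        pos += run_len
    return words


def zipf_exponent(word_counts):
    """Estimate s in frequency ~ C / rank**s by least squares in log-log space.

    Assumption: a plain log-log regression; the paper may use another fit.
    """
    freqs = sorted(word_counts.values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct words to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is negative for a decaying distribution; return its magnitude.
    return -sxy / sxx


# Toy example with made-up residues and codes (not data from the paper).
words = split_into_words("MKVLAAG", "CCHHHEE")  # ['MK', 'VLA', 'AG']
counts = Counter(words)
```

On a real corpus, the counts from all sequences would be accumulated into one `Counter` before calling `zipf_exponent`; an exponent close to 1, such as the 0.981 reported above, corresponds to the classic Zipf form.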