Discussion on the Universality of Information Content in DNA Sequences

LUO Liao-fu
DOI: https://doi.org/10.3969/j.issn.1673-162x.2005.01.001
2005-01-01
Abstract: To describe the characteristics of the distribution of k-tuple frequencies in genomes, four kinds of information quantity are defined. From a statistical analysis of the DNA sequences of 16 typical genomes, a universal relation between the k-tuple information entropy and the word length k is deduced. It is suggested that this universality is related to the neutral mutation-random drift of molecular evolution. A conserved over- or under-represented oligonucleotide fragment is defined as a specific word. The implications of these specific words for the information network of DNA, RNA, and protein interactions are discussed briefly.
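As a rough illustration of the kind of quantity the abstract refers to, the sketch below computes the Shannon entropy of the k-tuple frequency distribution of a sequence for several word lengths k. This is an assumption for illustration only: the abstract does not state the paper's four definitions, and the function names `ktuple_counts` and `ktuple_entropy`, the overlapping-window counting, and the toy sequence are all hypothetical choices, not the author's method.

```python
from collections import Counter
from math import log2


def ktuple_counts(seq, k):
    """Count overlapping k-tuples, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    alphabet = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= alphabet:
            counts[word] += 1
    return counts


def ktuple_entropy(seq, k):
    """Shannon entropy (in bits) of the observed k-tuple frequencies.

    Assumed definition: H_k = -sum_i p_i * log2(p_i), where p_i is the
    relative frequency of the i-th k-tuple. The paper may use a different
    normalization or logarithm base.
    """
    counts = ktuple_counts(seq, k)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts.values())


if __name__ == "__main__":
    toy = "ACGTTGCAACGTAGCTAGCTAGGATCCA"  # toy sequence, not real genome data
    for k in range(1, 4):
        print(k, round(ktuple_entropy(toy, k), 3))
```

For a sequence in which all 4^k words are equally frequent, this entropy reaches its maximum of 2k bits, so plotting it against k for a real genome gives one simple way to see how far the k-tuple distribution departs from uniformity.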