The Preferential Mode Analysis of DNA Sequence.

LF Luo, FM Ji
DOI: https://doi.org/10.1006/jtbi.1997.0485
IF: 2.405
1997-01-01
Journal of Theoretical Biology
Abstract: After reviewing approaches to the nucleotide correlation of DNA sequences, the preferential mode analysis method is emphasized and discussed in detail. The preferred modes and poor modes in coding regions, as well as in introns, 5'-caps and 3'-tails, are found through statistical analysis of GenBank sequence data from all kinds of species. The relation between preferential mode analysis and the informational parameter method is deduced. It is discovered that in higher species the coding sequences preferentially use the strong-weak bond language (strong bond = C, G; weak bond = A, T), while many noncoding regions (introns, 5'-caps, 3'-tails) use the purine-pyrimidine language. The use of different languages in coding and noncoding sequences is a result of evolution, and it may be related to the functional differences between these two kinds of regions. Furthermore, we find that many preferential triplets in coding sequences can be expressed in the form (* W S) (W = A, T; S = C, G), which may be explained by its relation to tRNA abundance. A systematic change of some mode contents with evolution has also been found.
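To make the two "languages" in the abstract concrete, the following minimal Python sketch recodes a DNA string into the strong-weak (S/W) and purine-pyrimidine (R/Y) alphabets and measures the fraction of in-frame triplets that match the (* W S) form. It is only an illustration and not the authors' method: the function names (`recode`, `triplet_counts`, `fraction_xws`), the `frame` parameter, and the use of non-overlapping triplets are assumptions made here, and the input is assumed to contain only A, C, G and T.

```python
from collections import Counter

# Two-letter alphabets named in the abstract (the R/Y labels are an assumption).
STRONG_WEAK = {"C": "S", "G": "S", "A": "W", "T": "W"}
PURINE_PYRIMIDINE = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def recode(seq, alphabet):
    """Rewrite a DNA string in a reduced two-letter alphabet."""
    return "".join(alphabet[base] for base in seq.upper())


def triplet_counts(seq, frame=0):
    """Count non-overlapping triplets starting at offset `frame` (hypothetical parameter)."""
    seq = seq.upper()
    return Counter(seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))


def fraction_xws(seq, frame=0):
    """Fraction of triplets of the form (* W S): any base, then A/T, then C/G."""
    counts = triplet_counts(seq, frame)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    hits = sum(n for t, n in counts.items() if t[1] in "AT" and t[2] in "CG")
    return hits / total


if __name__ == "__main__":
    example = "ATGAAA"  # toy sequence, not real data
    print(recode(example, STRONG_WEAK))        # WWSWWW
    print(recode(example, PURINE_PYRIMIDINE))  # RYRRRR
    print(fraction_xws(example))               # 0.5
```

Comparing `fraction_xws` between annotated coding and noncoding segments would be one simple way to probe the (* W S) preference described above; the paper's own statistics may be defined differently.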