Analysis of DNA sequences using methods of statistical physics

SV Buldyrev, NV Dokholyan, AL Goldberger, S Havlin, C-K Peng, HE Stanley, GM Viswanathan
1998-01-02
Abstract: We review the present status of the studies of DNA sequences using methods of statistical physics. We present evidence, based on systematic studies of the entire GenBank database, supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range, i.e., base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the DNA. We discuss the mechanisms of molecular evolution that may lead to the presence of long-range power-law correlations in noncoding DNA and their absence in coding DNA. One such mechanism is the simple repeat expansion, which recently has attracted the attention of the biological community in conjunction with genetic diseases. We also review new tools – e.g., detrended fluctuation analysis – that are useful for studies of complex hierarchical …