New Stopping Criteria for Segmenting DNA Sequences

Wentian Li
DOI: https://doi.org/10.1103/physrevlett.86.5815
IF: 8.6
2001-06-18
Physical Review Letters
Abstract: We propose a solution to the stopping-criterion problem in segmenting inhomogeneous DNA sequences with complex statistical patterns. The new stopping criterion is based on the Bayesian information criterion in the model selection framework. When this criterion was applied to a telomere of S. cerevisiae and the complete sequence of E. coli, borders of biologically meaningful units were identified, and a more reasonable number of domains was obtained. We also introduce a measure called segmentation strength, which can be used to control the delineation of large domains. The relationship between the average domain size and the threshold of segmentation strength is determined for several genome sequences.
physics, multidisciplinary
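
The abstract does not state the formulas, so the following is only a minimal sketch of how a BIC-style stopping rule could drive recursive segmentation, not the paper's implementation. It assumes the split statistic is 2N times the Jensen-Shannon divergence between the base compositions on either side of a candidate cut (the log-likelihood ratio of a two-segment versus a one-segment model), and that a cut is accepted only when this statistic exceeds a BIC penalty of `extra_params * ln(N)`. The names `best_split`, `segment`, `extra_params`, and `s0`, the default of 4 extra parameters, and the definition of strength as the relative excess over the penalty are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(counts, n):
    """Shannon entropy (in nats) of a composition given symbol counts."""
    return -sum(c / n * math.log(c / n) for c in counts.values() if c > 0)


def best_split(seq):
    """Return (cut index, 2*N*D_JS) for the cut with the largest divergence.

    Returns (None, 0.0) when no cut gives a positive statistic.
    """
    n = len(seq)
    total = Counter(seq)
    h_all = entropy(total, n)
    left = Counter()
    best_i, best_stat = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        right = total - left  # Counter subtraction keeps positive counts only
        d_js = h_all - (i / n) * entropy(left, i) - ((n - i) / n) * entropy(right, n - i)
        stat = 2 * n * d_js
        if stat > best_stat:
            best_i, best_stat = i, stat
    return best_i, best_stat


def segment(seq, offset=0, extra_params=4, s0=0.0):
    """Recursively cut seq; return sorted cut positions in original coordinates.

    extra_params: assumed number of extra free parameters in the two-segment
                  model (illustrative default, not taken from the abstract).
    s0: assumed threshold on the hypothetical strength (stat - penalty) / penalty;
        s0 = 0 reduces to the plain BIC-style comparison.
    """
    n = len(seq)
    if n < 2:
        return []
    i, stat = best_split(seq)
    if i is None:
        return []
    penalty = extra_params * math.log(n)
    strength = (stat - penalty) / penalty
    if strength <= s0:
        return []  # stopping criterion: the cut is not justified
    return (segment(seq[:i], offset, extra_params, s0)
            + [offset + i]
            + segment(seq[i:], offset + i, extra_params, s0))


if __name__ == "__main__":
    toy = "A" * 200 + "G" * 200  # synthetic two-domain sequence, not real data
    print(segment(toy))
```

Raising `s0` in this sketch suppresses weaker cuts, so only boundaries between strongly contrasting domains remain, which mirrors the role the abstract assigns to a threshold on segmentation strength.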