Finding repetitions in DNA sequences based on a new index-succeeding unit array

Di Wang, Baichen Chen, Qingquan Wu, Yi Zhao, Changyong Yu, Guoren Wang
2006-01-01
Journal of Computational Information Systems
Abstract: Since the repetitions in a DNA sequence are of great biological significance, searching for them has naturally been an important topic in gene analysis. This paper proposes two new concepts of repetitions: LPR for perfect repetitions and TSAR for approximate repetitions. A lightweight index structure, the Succeeding Unit Array (SUA), is designed based on pattern units. The SUA effectively reduces space consumption and resolves the space bottleneck in the search for repetitions. All LPRs and TSARs can be detected on the SUA. Theoretical analysis and experimental results show that both the space and time complexity of the algorithms are satisfactory.