Trie-Based Data Structures For Sequence Assembly

T Chen, SS Skiena
DOI: https://doi.org/10.1007/3-540-63220-4_61
1997-01-01
Abstract: We investigate the application of trie-based data structures, namely suffix trees and suffix arrays, to the problem of overlap detection in fragment assembly. Both data structures are analyzed theoretically and experimentally for speed and space. By using heuristics, we greatly reduce the number of calls to the time-consuming dynamic programming step, and have improved the speed of overlap detection by up to a factor of 1,000 with high accuracy in our collaborative DNA sequencing with Brookhaven National Laboratory. We also study the problem of approximating maximum space savings in trie structures for unification factoring in logic programming, which is proved to be hard.
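To make the idea of overlap detection with a suffix array concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not describe their heuristics or their dynamic programming step, so this sketch only finds exact suffix-prefix matches, which could serve as candidates for a more expensive alignment. The function names, the `min_len` parameter, and the example fragments are all assumptions made for illustration, and the suffix array is built naively by sorting suffix strings rather than with a linear-time construction.

```python
from bisect import bisect_left


def build_suffix_array(fragments, min_len):
    """Return a sorted list of (suffix, fragment_index) pairs.

    Naive construction for illustration only: every suffix of length
    >= min_len is materialized as a string and sorted.
    """
    entries = []
    for idx, frag in enumerate(fragments):
        for start in range(len(frag) - min_len + 1):
            entries.append((frag[start:], idx))
    entries.sort()
    return entries


def candidate_overlaps(fragments, min_len):
    """Return the set of (a, b, length) triples where a suffix of
    fragment a is exactly equal to the length-`length` prefix of
    fragment b. Exact matching only; no errors are tolerated.
    """
    suffixes = build_suffix_array(fragments, min_len)
    found = set()
    for b, frag in enumerate(fragments):
        for length in range(min_len, len(frag) + 1):
            prefix = frag[:length]
            # (prefix, -1) sorts before every real entry with the same string.
            pos = bisect_left(suffixes, (prefix, -1))
            while pos < len(suffixes) and suffixes[pos][0] == prefix:
                a = suffixes[pos][1]
                if a != b:
                    found.add((a, b, length))
                pos += 1
    return found


# Hypothetical fragments, chosen so that consecutive ones overlap by 4 bases.
fragments = ["ACGTTGCA", "TGCAGGAT", "GGATCCAA"]
overlaps = candidate_overlaps(fragments, min_len=4)
```

In a pipeline of the kind the abstract describes, only the pairs returned by such a filter would be passed on to dynamic programming, which is where the reduction in expensive calls would come from. Real sequencing reads contain errors, so a practical filter would need to tolerate mismatches, which this sketch does not attempt.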