Speeding Up Double-Array Trie Construction for String Matching

Niu Shuai, Liu Yanbing, Song Xinbo
DOI: https://doi.org/10.1007/978-3-642-35795-4_72
2013-01-01
Abstract: The Double-Array Trie is a data structure for implementing a trie that offers both compactness and fast access, and it is therefore widely adopted by string matching algorithms. However, when applied to large-scale sets of strings, its construction process suffers from a large temporary peak in memory consumption and from low construction speed, which makes it hard to meet the requirements of inspecting high-speed network traffic in real time. This paper presents two optimization strategies for the Double-Array Trie construction process that avoid the temporary memory peak and reduce construction time. The first is to generate the trie recursively. The second is to choose the method for finding the current node's base value according to the node's number of child nodes. We applied the improved construction to the Aho-Corasick algorithm and tested it on several large-scale sets of strings. The results show that both space consumption and construction time improve significantly while search efficiency stays the same.
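To make the terms in the abstract concrete, the following is a minimal Python sketch of a double-array trie. It is not the authors' implementation. The class name `DoubleArrayTrie`, the byte-plus-one character codes, the zero-byte terminator, and the first-fit search for a base value are all assumptions made for illustration. The sketch builds the `base` and `check` arrays by recursing directly over the sorted key list, without first materializing a pointer-based trie, and it uses a cheaper slot scan when a node has a single child. The abstract does not give the paper's actual criteria for switching methods, so that split is only one plausible reading.

```python
from typing import List, Tuple


class DoubleArrayTrie:
    """Illustrative double-array trie over UTF-8 bytes (hypothetical sketch)."""

    def __init__(self, words: List[str]) -> None:
        # Assumption: words contain no NUL character, so a 0 byte can act
        # as an end-of-key terminator.
        keys = sorted({w.encode("utf-8") + b"\x00" for w in words})
        self.base: List[int] = [0]
        self.check: List[int] = [0]  # -1 marks a free slot; the root owns slot 0
        if keys:
            self._build(0, keys, 0, len(keys), 0)

    def _free(self, t: int) -> bool:
        return t >= len(self.check) or self.check[t] < 0

    def _ensure(self, size: int) -> None:
        # Grow both arrays so that indices 0..size-1 exist.
        grow = size - len(self.check)
        if grow > 0:
            self.base.extend([0] * grow)
            self.check.extend([-1] * grow)

    def _find_base(self, codes: List[int]) -> int:
        if len(codes) == 1:
            # Single child: any free slot at or after the code works.
            t = codes[0]
            while not self._free(t):
                t += 1
            return t - codes[0]
        # Several children: first-fit scan over candidate base values.
        b = 0
        while not all(self._free(b + c) for c in codes):
            b += 1
        return b

    def _build(self, state: int, keys: List[bytes],
               lo: int, hi: int, depth: int) -> None:
        # Group keys[lo:hi] by the byte at position `depth`; because the
        # keys are sorted, each group is a contiguous range.
        groups: List[Tuple[int, int, int]] = []
        i = lo
        while i < hi:
            byte = keys[i][depth]
            j = i
            while j < hi and keys[j][depth] == byte:
                j += 1
            groups.append((byte + 1, i, j))  # codes are byte + 1, so >= 1
            i = j
        codes = [c for c, _, _ in groups]
        b = self._find_base(codes)
        self._ensure(b + max(codes) + 1)
        self.base[state] = b
        for c in codes:
            self.check[b + c] = state  # reserve every child slot first
        for c, start, end in groups:
            if c != 1:  # code 1 is the terminator: that child is a leaf
                self._build(b + c, keys, start, end, depth + 1)

    def contains(self, word: str) -> bool:
        s = 0
        for byte in word.encode("utf-8") + b"\x00":
            t = self.base[s] + byte + 1
            if t >= len(self.check) or self.check[t] != s:
                return False
            s = t
        return True
```

A lookup follows one transition per byte: from state `s` on code `c`, the candidate next state is `base[s] + c`, and it is accepted only if `check` at that index equals `s`. For example, after `trie = DoubleArrayTrie(["he", "she", "his", "hers"])`, the call `trie.contains("hers")` succeeds while `trie.contains("her")` fails, because no terminator transition leaves the state reached by "her".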