Design of Fast Multiple String Searching Based on Improved Prefix Tree

Yu Cheng, Tao Zhang
DOI: https://doi.org/10.1109/WKDD.2010.138
2010-01-01
Abstract: Multi-string matching is one of the most important components of data mining tasks. New applications in many technology fields require high-performance string matching algorithms. This paper first presents a new string searching approach based on a data structure called a prefix tree. The innovative algorithm eliminates the functional overlap between the HASH table and the Prefix Function. We then make a small improvement to the prefix tree and present a second algorithm that is faster and uses less space. It is demonstrated analytically that the two algorithms inherit the optimality and are very competitive in practice. In tests on both real-life and synthetic data, our algorithms are also efficient, and especially effective for varied string patterns and large alphabets.
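For readers unfamiliar with the underlying data structure, the following is a minimal sketch of multi-string search over a plain prefix tree (trie). It is not the algorithm proposed in the paper, whose improved prefix tree is not described in the abstract; the names `TrieNode`, `build_trie`, and `search` are hypothetical and chosen only for illustration. The sketch restarts the walk at every text position, so it shows the idea of sharing common prefixes among patterns rather than an optimized search.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    """One node of a plain prefix tree (illustrative, not the paper's structure)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # Set to the full pattern when a pattern ends at this node.
        self.pattern: Optional[str] = None


def build_trie(patterns: List[str]) -> TrieNode:
    """Insert every pattern so that shared prefixes share nodes."""
    root = TrieNode()
    for p in patterns:
        node = root
        for ch in p:
            node = node.children.setdefault(ch, TrieNode())
        node.pattern = p
    return root


def search(text: str, root: TrieNode) -> List[Tuple[int, str]]:
    """Return (start_index, pattern) for every occurrence of any pattern."""
    matches: List[Tuple[int, str]] = []
    for start in range(len(text)):
        node = root
        # Walk down the trie for as long as the text keeps matching.
        for i in range(start, len(text)):
            nxt = node.children.get(text[i])
            if nxt is None:
                break
            node = nxt
            if node.pattern is not None:
                matches.append((start, node.pattern))
    return matches


if __name__ == "__main__":
    trie = build_trie(["he", "she", "his", "hers"])
    print(search("ushers", trie))
```

All patterns are checked in a single walk per starting position, which is the property a prefix tree contributes; the worst-case cost of this naive version is proportional to the text length times the longest pattern length.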