A fast string matching algorithm for large-scale pattern sets
Wei Zhang, Yibo Xue, Zongwei Zhou, Dongsheng Wang
DOI: https://doi.org/10.3772/j.issn.1002-0470.2009.06.001
2009-01-01
Abstract: Pattern matching is one of the key technologies in network and information security systems, but the performance of classical pattern matching algorithms degrades seriously when the pattern set becomes large, especially beyond 50,000 patterns. To address this problem, this paper proposes a new architectural large-scale pattern matching algorithm (ALPM) for large-scale pattern sets. The ALPM builds on the shift concept of the classical Wu-Manber (WM) algorithm and takes features of the hardware architecture into account. To improve matching performance, it adopts several pre-processing and matching strategies: using two different hash functions to access the Shift and Hash tables, optimizing pre-processing to choose the best entry signs from the patterns for the two tables, and dynamically adjusting hash collisions according to the cache size and the number of patterns. The experimental results show that, for large-scale pattern sets, the matching performance of the ALPM is 5 to 10 times higher than that of the classical WM.
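The shift concept and the two-table layout mentioned in the abstract can be illustrated with a minimal Wu-Manber-style sketch. This is not the authors' ALPM implementation: the block size of 2, the table sizes, and the two hash functions (`shift_hash`, `bucket_hash`) are assumptions chosen for illustration, and the entry-sign selection and cache-aware collision tuning described above are not modelled.

```python
def shift_hash(block: bytes, size: int) -> int:
    # Hypothetical hash for the Shift table: simple polynomial hash.
    h = 0
    for c in block:
        h = h * 31 + c
    return h % size


def bucket_hash(block: bytes, size: int) -> int:
    # Hypothetical hash for the Hash table: 32-bit FNV-1a.
    h = 2166136261
    for c in block:
        h = ((h ^ c) * 16777619) & 0xFFFFFFFF
    return h % size


def build_tables(patterns, block=2, shift_size=4096, bucket_size=4096):
    m = min(len(p) for p in patterns)  # only the first m bytes drive the tables
    if m < block:
        raise ValueError("every pattern must be at least `block` bytes long")
    default = m - block + 1  # safe shift for a block seen in no pattern
    shift = [default] * shift_size
    buckets = [[] for _ in range(bucket_size)]
    for p in patterns:
        for end in range(block, m + 1):
            slot = shift_hash(p[end - block:end], shift_size)
            # Colliding blocks keep the smaller shift, so no match is skipped.
            shift[slot] = min(shift[slot], m - end)
        # Index each pattern by the last block of its m-byte prefix.
        buckets[bucket_hash(p[m - block:m], bucket_size)].append(p)
    return m, shift, buckets


def search(text, patterns, block=2, shift_size=4096, bucket_size=4096):
    m, shift, buckets = build_tables(patterns, block, shift_size, bucket_size)
    matches = []
    pos = m - 1  # index of the last byte of the current m-byte window
    while pos < len(text):
        blk = text[pos - block + 1:pos + 1]
        s = shift[shift_hash(blk, shift_size)]
        if s == 0:
            start = pos - m + 1
            # Candidates share a bucket only; verify each one in full.
            for p in buckets[bucket_hash(blk, bucket_size)]:
                if text.startswith(p, start):
                    matches.append((start, p))
            pos += 1
        else:
            pos += s
    return sorted(matches)


if __name__ == "__main__":
    print(search(b"she sells seashells", [b"sells", b"shell", b"sea"]))
    # prints [(4, b'sells'), (10, b'sea'), (13, b'shell')]
```

Because colliding Shift slots keep the minimum shift and every candidate is verified against the text, shrinking either table only costs speed, not correctness. That trade-off between table size and collision rate is the kind of quantity the abstract says the ALPM adjusts against cache size and pattern count.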