Comparison Of Stringmatching Algorithms: An Aid To Information Content Security
An Du,Bx Fang,Xc Yun,Mz Hu,Xr Zheng
DOI: https://doi.org/10.1109/ICMLC.2003.1260090
2003-01-01
Abstract: We analyze the core ideas of three basic string matching algorithms (KMP, BM, DFA), describe the principles of five advanced on-line multi-pattern matching algorithms (AC, RAC, AQR, SBOM, Mgrep), and compare the matching efficiency of these five algorithms in terms of search speed, preprocessing time, and memory use on three sets of web information strings (Chinese phrases, URL strings, and Email address strings), focusing especially on how pattern set size and minimum pattern length influence efficiency. The comparison shows that the AQR algorithm is rather efficient for matching on Chinese text and URL strings, while SBOM does better on Email address matching. The skipping matching algorithms (such as Mgrep) are much more efficient for small pattern sets. A combination of efficient matching algorithms therefore seems likely to improve the performance and efficiency of Information Content Security Systems.
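To illustrate the multi-pattern matching setting, the following is a minimal Python sketch of the classic Aho-Corasick (AC) automaton, one of the five algorithms named above. It is not the implementation evaluated in the paper; the function names `build_automaton` and `search` and the sample patterns are assumptions chosen for illustration only.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure, and output tables for a set of patterns."""
    goto = [{}]   # goto[s] maps a character to the next state; state 0 is the root
    fail = [0]    # fail[s] is the state to fall back to on a mismatch
    out = [[]]    # out[s] lists the patterns that end at state s

    # Phase 1: insert every pattern into a trie.
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)

    # Phase 2: breadth-first traversal to compute failure links.
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            # Inherit patterns that end at the failure state.
            out[t] = out[t] + out[fail[t]]
            queue.append(t)
    return goto, fail, out


def search(text, patterns):
    """Return (start_index, pattern) for every occurrence found in one pass."""
    goto, fail, out = build_automaton(patterns)
    s = 0
    hits = []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits


if __name__ == "__main__":
    # Hypothetical pattern set, used only to exercise the sketch.
    print(search("ushers", ["he", "she", "his", "hers"]))
```

The sketch scans the text once regardless of how many patterns are loaded, which is why automaton-based methods are usually contrasted with skipping methods such as Mgrep when pattern sets grow large. Because Python strings are sequences of Unicode code points, the same code accepts Chinese phrases, URLs, or Email addresses as patterns without modification.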