Compact DFA Structure for Multiple Regular Expressions Matching

Wei Lin, Yi Tang, Bin Liu, Derek Pao, XiaoFei Wang
DOI: https://doi.org/10.1109/icc.2009.5198833
2009-01-01
Abstract: New applications such as real-time deep packet inspection require high-speed regular expression (regex) matchers, and the number of regexes in a pattern store is growing to several thousand, which calls for a memory-efficient solution. This paper presents CPDFA, a hardware-based compact DFA structure for matching multiple regexes. Statistics on the regexes in Snort and L7-filter rules show that the transitions from each state to its next states are not evenly distributed: transitions from each state to its three most popular next states account for about 90% of all transitions. CPDFA therefore uses an indirect index table to represent the transitions to these three most popular next states more efficiently. The remaining transitions, about 10% of the total, are stored either in a direct transition table or in K parallel SRAMs, depending on whether the number of remaining transitions from the same state exceeds K. Simulation shows that the CPDFA structure saves about 90% of memory compared with the original DFA structure. Using a pipelined architecture on an FPGA, CPDFA can advance one character per memory access cycle.
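The split described in the abstract can be illustrated with a minimal software sketch. It is not the paper's hardware layout: the per-character code array stands in for the indirect index table, a dictionary stands in for the storage of remaining transitions, and the names `compress_state`, `build_cpdfa`, `next_state`, the threshold value `K = 4`, and the tiny example DFA are all hypothetical. The mapping of "more than K" to the direct table is likewise an assumed reading of the abstract.

```python
from collections import Counter

K = 4  # hypothetical threshold; the abstract does not give a value


def compress_state(row, top_n=3, k=K):
    """Compress one DFA state; row[ch] is the next state for input symbol ch."""
    counts = Counter(row)
    # The (up to) three most popular next states of this state.
    popular = [state for state, _ in counts.most_common(top_n)]
    # Code 0 is reserved for "not a popular next state"; codes 1..3 index `popular`.
    code_of = {state: i + 1 for i, state in enumerate(popular)}
    codes = [code_of.get(nxt, 0) for nxt in row]
    # Transitions not covered by the popular states are stored explicitly.
    remaining = {ch: nxt for ch, nxt in enumerate(row) if nxt not in code_of}
    # Assumed reading: more than K leftovers go to a direct table, otherwise
    # to K parallel SRAMs. Here this is only a label.
    storage = "direct_table" if len(remaining) > k else "parallel_srams"
    return {"popular": popular, "codes": codes,
            "remaining": remaining, "storage": storage}


def build_cpdfa(dfa):
    """Compress every state of a dense transition table."""
    return [compress_state(row) for row in dfa]


def next_state(compact, state, ch):
    """Look up one transition in the compressed structure."""
    entry = compact[state]
    code = entry["codes"][ch]
    if code:
        return entry["popular"][code - 1]
    return entry["remaining"][ch]


# Hypothetical 4-state DFA over an 8-symbol alphabet, for illustration only.
example_dfa = [
    [0, 0, 0, 0, 1, 1, 2, 3],
    [1, 1, 1, 1, 1, 0, 0, 2],
    [2, 2, 2, 0, 0, 1, 1, 3],
    [3, 3, 3, 3, 3, 3, 3, 3],
]
example_compact = build_cpdfa(example_dfa)
```

In this sketch each input symbol needs only a 2-bit code per state when its target is one of the popular next states, which is where the memory saving would come from if most transitions fall into that group, as the abstract reports for Snort and L7-filter rules.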