Extraction of Fingerprint from Regular Expression for Efficient Prefiltering

Xiaofei Wang,Junchen Jiang,Wei Lin,Yi Tang,Xiaojun Wang,Bin Liu
DOI: https://doi.org/10.1109/iccomta.2009.5349207
2009-01-01
Abstract:Deep packet inspection at high speed has become extremely important due to its application in a wide range of network applications, such as network security and network monitoring. Network intrusion detection system (NIDS) uses a collection of signatures of known security threats and viruses to scan the payload of each packet. Signatures are often specified in the form of regular expressions (regex), called patterns, which are traditionally implemented as finite automata. Deterministic finite automata (DFA) is fast, but requires prohibitive amounts of memory which limits their practical use. Instead of matching an incoming packet with each individual regex in a ruleset, we match the packet with a fixed substring, called fingerprint, of a regex first. Fixed string matching is faster and consumes less energy than regex matching. The fact is that if a packet does not match with the fingerprint of a regex, it will not match the regex itself. So fingerprints can be used in a prefilter engine to filter out those packets and do not match any of the fingerprints of the regex in a rule set, which represents normal non-malicious traffic. This actually reduces the number of regex rules being matched, which results in increased throughput of the NIDS. We present a weight scheme to extract a good fingerprint from a regex. A good fingerprint is the one that not only indicates the regex uniquely, but also occurs as less as possible in the matching procedure. We demonstrate how to use fingerprints for efficient prefiltering by means of Bloom filters in practice.
What problem does this paper attempt to address?