Skip Finite Automaton: A Content Scanning Engine to Secure Enterprise Networks

Junchen Jiang, Yi Tang, Bin Liu, Yang Xu, Xiaofei Wang
DOI: https://doi.org/10.1109/GLOCOM.2010.5683165
2010-01-01
Abstract: Today's file sharing networks are creating potential security problems for enterprise networks, i.e., the leakage of confidential documents. In order to prevent such leakage, we propose the Data Leakage Prevention System (DLPS), which is applied at the entrance of the enterprise network to filter out outgoing sensitive information. The DLPS is based on a content scanning engine which defines a new type of matching problem, called longest overlap matching, which also exists as a basic problem in many other applications where content is delivered in small blocks. We study the problem by comparing it with the traditional pattern matching problem in Deep Packet Inspection (DPI) of Network Intrusion Detection Systems (NIDS), whose solutions are based on finite automata. We develop a new finite automaton representation called Skip-Finite Automata (Skip-FA), which detects the packets carrying sensitive information by using default transitions to implicitly track the overlapping parts between packets' payloads and sensitive files. The simulation results show that our system achieves a matching speed of about 10B+ per memory access for a small file set (>20KB) and 100B+ per memory access for a large file set (>2500KB). We also find that the memory consumption of Skip-FA is almost the same as that of the original files.
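The abstract does not define longest overlap matching formally. As a rough illustration only, the sketch below assumes it means finding the longest contiguous run of bytes that a packet payload shares with a sensitive file, and flags the payload when that run reaches a threshold. This is a brute-force baseline, not the Skip-FA construction; the names `longest_overlap`, `is_sensitive`, and the `threshold` parameter are hypothetical and do not come from the paper.

```python
from typing import Iterable


def longest_overlap(payload: bytes, document: bytes) -> int:
    """Length of the longest contiguous byte run shared by payload and document.

    Assumption: "overlap" is read here as the longest common substring.
    Runs in O(len(payload) * len(document)) time with one DP row of memory.
    """
    best = 0
    prev = [0] * (len(document) + 1)
    for i in range(1, len(payload) + 1):
        curr = [0] * (len(document) + 1)
        for j in range(1, len(document) + 1):
            if payload[i - 1] == document[j - 1]:
                # Extend the run that ended at the previous byte of both inputs.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best:
                    best = curr[j]
        prev = curr
    return best


def is_sensitive(payload: bytes, documents: Iterable[bytes], threshold: int) -> bool:
    """Flag a payload whose overlap with any document reaches the threshold."""
    return any(longest_overlap(payload, doc) >= threshold for doc in documents)


if __name__ == "__main__":
    files = [b"top SECRET plan"]
    print(longest_overlap(b"xxSECRETyy", files[0]))
    print(is_sensitive(b"xxSECRETyy", files, threshold=5))
```

A quadratic scan like this touches every byte of every file for each packet, which is the cost an automaton-based engine is meant to avoid.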