Performance Analysis of Machine Learning and Pattern Matching Techniques for Deep Packet Inspection in Firewalls

J.V. BibalBenifa, Saravanan Krishnann, Hoang Long, Raghvendra Kumar, David Taniar
DOI: https://doi.org/10.21203/rs.3.rs-260788/v1
2021-09-07
Abstract: Malware is one of the major security threats, with the potential to disrupt computer operations instantly. The majority of internet attacks are caused by malware distributed over HTTP. A firewall is essential for preventing such attacks. The most efficient method of preventing network intrusion is Deep Packet Inspection (DPI), which is currently implemented in advanced firewalls. This research work aims to detect and prevent network intrusion using a hybrid method that combines DPI, Pattern Matching (PM), and Machine Learning (ML) techniques for the classification and identification of attacks. Here, DPI is performed by the Boyer-Moore-Horspool (BMHP) pattern matching algorithm, and ten ML algorithms, including Support Vector Machines (SVM), Linear-SVM (L-SVM), K-Nearest Neighbors (KNN), Multi-Layer Perceptron (MLP), Decision Tree (DT), Random Forest (RF), AdaBoost (Ada), Gaussian Naive Bayes (GaNB), and Bernoulli Naive Bayes (BeNB), are employed for classification. The proposed work is evaluated in both sequential and parallel modes and is customized for identifying fuzzy, impersonation, and Denial of Service (DoS)-based attacks. The proposed system is analyzed along several dimensions, such as the performance of the ML methods and the role of DPI in attack identification, including pattern matching efficiency. The investigation shows that the BMHP algorithm has the lowest time and memory consumption, about 0.0028 sec and 125.4 MiB respectively. Similarly, SVM achieves an accuracy of 99.91% with the lowest time and memory consumption, about 18.185 sec and 303.5 MiB respectively.
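As an illustration of the pattern matching step named in the abstract, the following is a minimal sketch of the textbook Boyer-Moore-Horspool search applied to a packet payload. It is not the authors' implementation: the function names `build_shift_table` and `horspool_search`, the sample payload, and the sample signature are all hypothetical and chosen only for the example.

```python
def build_shift_table(pattern: bytes) -> dict:
    """Map each byte of the pattern (except the last) to its shift distance."""
    m = len(pattern)
    # Later occurrences overwrite earlier ones, so each byte ends up with the
    # distance from its rightmost occurrence (excluding the final position)
    # to the end of the pattern.
    return {byte: m - 1 - i for i, byte in enumerate(pattern[:-1])}


def horspool_search(payload: bytes, pattern: bytes) -> int:
    """Return the index of the first match of pattern in payload, or -1."""
    m, n = len(pattern), len(payload)
    if m == 0:
        return 0
    if m > n:
        return -1
    shift = build_shift_table(pattern)
    pos = 0
    while pos <= n - m:
        if payload[pos:pos + m] == pattern:
            return pos
        # Shift by the table entry for the payload byte aligned with the
        # last pattern position; bytes absent from the table shift by m.
        pos += shift.get(payload[pos + m - 1], m)
    return -1


# Hypothetical payload and signature, for illustration only.
sample_payload = b"GET /index.php?id=1 UNION SELECT password FROM users"
sample_signature = b"UNION SELECT"
match_index = horspool_search(sample_payload, sample_signature)
```

The shift table lets the search skip several payload bytes per mismatch instead of advancing one byte at a time, which is the usual reason this family of algorithms is considered for scanning payloads against signatures.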