A Flow is a Stream of Packets: A Stream-Structured Data Approach for DDoS Detection

Raja Giryes, Lior Shafir, Avishai Wool
DOI: https://doi.org/10.48550/arXiv.2405.07232
2024-05-12
Abstract: Distributed Denial of Service (DDoS) attacks are getting increasingly harmful to the Internet, showing no signs of slowing down. Developing an accurate detection mechanism to thwart DDoS attacks is still a big challenge due to the rich variety of these attacks and the emergence of new attack vectors. In this paper, we propose a new tree-based DDoS detection approach that operates on a flow as a stream structure, rather than the traditional fixed-size record structure containing aggregated flow statistics. Although aggregated flow records have gained popularity over the past decade, providing an effective means for flow-based intrusion detection by inspecting only a fraction of the total traffic volume, they are inherently constrained. Their detection precision is limited not only by the lack of packet payloads, but also by their structure, which is unable to model fine-grained inter-packet relations, such as packet order and temporal relations. Additionally, inferring aggregated flow statistics must wait for the complete flow to end. Here we show that considering flow inputs as variable-length streams composed of their associated packet headers, allows for very accurate and fast detection of malicious flows. We evaluate our proposed strategy on the CICDDoS2019 and CICIDS2017 datasets, which contain a comprehensive variety of DDoS attacks. Our approach matches or exceeds existing machine learning techniques' accuracy, including state-of-the-art deep learning methods. Furthermore, our method achieves significantly earlier detection, e.g., with CICDDoS2019 detection based on the first 2 packets, which corresponds to an average time-saving of 99.79% and uses only 4-6% of the traffic volume.
Cryptography and Security
What problem does this paper attempt to address?
This paper addresses several key problems in Distributed Denial-of-Service (DDoS) attack detection:

1. **Limitations of traditional flow records**: Traditional DDoS detection methods rely on aggregated flow records, which contain statistical information about each flow. This representation has the following shortcomings:
   - **Lack of fine-grained information**: It cannot capture the order of packets or the temporal relationships between them.
   - **Limited detection accuracy**: Because flow records lack packet payloads, deep packet inspection is not available, which limits detection precision.
   - **Delayed detection**: Aggregated statistics can be computed only after the entire flow has ended, which delays detection.
2. **Improving detection speed and accuracy**: Coping with increasingly complex and diverse DDoS attacks requires a faster and more accurate detection mechanism. Existing methods are often neither timely nor accurate enough when dealing with large-scale traffic.
3. **Low resource consumption**: Modern networks carry large and complex traffic volumes, so a detection method must use resources efficiently to keep computing and storage costs down.

To solve these problems, the authors propose a new tree-based DDoS detection method that treats a flow as a variable-length stream of packet headers rather than as a traditional fixed-size aggregated flow record. This enables early detection of malicious flows and significantly improves detection speed and accuracy. Specifically:

- **Direct analysis of packet headers**: Analyzing packet headers directly captures the order of events and the temporal relationships within a flow, which improves detection accuracy.
- **Early detection**: Accurate detection is possible using only the first few packets of a flow (the first 2 packets on CICDDoS2019), greatly reducing the time and traffic required for detection.
- **Compact data representation**: The method uses only 4-6% of the traffic volume, greatly reducing resource consumption.

Evaluated on the CICDDoS2019 and CICIDS2017 datasets, the method matches or exceeds the accuracy of existing machine-learning techniques, including state-of-the-art deep learning methods, and detects attacks much earlier, with an average time saving of 99.79% on CICDDoS2019.
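
The difference between an aggregated flow record and a stream of packet headers can be illustrated with a minimal Python sketch. It is not the authors' implementation or feature set: the `PacketHeader` fields, the helper names `aggregate_record` and `stream_prefix`, and the sample values are all hypothetical, chosen only to show that two flows can produce the same aggregate statistics while differing in packet order, and that a stream prefix is available before the flow ends.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple


@dataclass(frozen=True)
class PacketHeader:
    """Hypothetical subset of header fields; the paper's feature set may differ."""
    timestamp: float  # seconds since the first packet of the flow
    size: int         # packet length in bytes
    tcp_flags: str    # e.g. "S", "SA", "A"
    outbound: bool    # True if sent by the flow initiator


def aggregate_record(flow: List[PacketHeader]) -> Tuple[int, int, float]:
    """Fixed-size record: packet count, total bytes, mean inter-arrival time.

    Computing it requires the complete flow, and packet order is lost.
    """
    gaps = [b.timestamp - a.timestamp for a, b in zip(flow, flow[1:])]
    return (len(flow), sum(p.size for p in flow), mean(gaps) if gaps else 0.0)


def stream_prefix(flow: List[PacketHeader], k: int = 2) -> List[PacketHeader]:
    """Stream view: the first k headers in arrival order, usable before the flow ends."""
    return flow[:k]


# Two illustrative flows with the same packets but a different order.
flow_a = [
    PacketHeader(0.0, 60, "S", True),
    PacketHeader(0.1, 60, "SA", False),
    PacketHeader(0.2, 52, "A", True),
]
flow_b = [
    PacketHeader(0.0, 60, "S", True),
    PacketHeader(0.1, 52, "A", True),
    PacketHeader(0.2, 60, "SA", False),
]

# The aggregated records are identical, so a record-based detector cannot tell them apart.
same_record = aggregate_record(flow_a) == aggregate_record(flow_b)

# The two-packet prefixes differ, so a stream-based detector has something to work with.
different_prefix = stream_prefix(flow_a, 2) != stream_prefix(flow_b, 2)
```

In this sketch a classifier would consume the output of `stream_prefix` rather than `aggregate_record`; which model to use and how to encode the headers are left open here, since the summary above only states that the approach is tree-based.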