Local Augmentation with Functionality-Preservation for Semi-Supervised Graph Intrusion Detection
Junpeng He, Hanyue Kong, Shihe Zhang, Xiong Li, Weina Niu, Xiaosong Zhang, Fagen Li
DOI: https://doi.org/10.1109/icc51166.2024.10622440
2024-01-01
Abstract: Recently, deep learning (DL)-driven intrusion detection has gained traction as a way to reduce the economic and privacy losses caused by the sharp increase in cyber attacks. Beyond statistical network traffic features, the inherent attack topologies also matter because they are closely associated with attack behaviors, so many works use graph neural networks (GNNs) to exploit them and improve detection performance. Nevertheless, these topologies contain many isolated IPs, which interact with far fewer targets than non-isolated IPs. Under the message-passing scheme, information from isolated IPs is less likely to propagate during GNN training, so the model learns a poor representation of the traffic connected to these isolated IPs. As a result, GNN-based intrusion detection models perform unsatisfactorily on this traffic. To address this problem, we implement a generation algorithm with traffic functionality preservation that improves GNN-based intrusion detectors by locally augmenting this type of traffic. The proposed method first converts the IP-based graph of the traffic dataset into a line graph, and then uses a conditional denoising diffusion probabilistic model to generate new graph snapshots that enhance the expressiveness of isolated IPs in GNN message aggregation. We evaluate our method against state-of-the-art works on three datasets, NF-BoT-IoT-V2, NF-ToN-IoT-V2, and NF-CSE-CIC-IDS2018-V2, where it achieves accuracies of 99.11% (↑0.11%), 93.84% (↑2.47%), and 97.15% (↑3.94%), respectively. A case study shows that the proposed local augmentation improves detection performance under different isolation thresholds. Moreover, our method alleviates the over-smoothing problem to a certain extent, and the augmented traffic is also of higher quality.
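The first step described in the abstract, turning an IP-based graph into a line graph, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each flow is an undirected (source IP, destination IP) pair, that an IP counts as "isolated" when it has at most `threshold` distinct peers, and that two flows are adjacent in the line graph when they share an IP endpoint. The function names, the threshold rule, and the toy addresses are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def ip_degrees(flows):
    """Return the number of distinct peers for every IP in the flow list."""
    peers = defaultdict(set)
    for src, dst in flows:
        peers[src].add(dst)
        peers[dst].add(src)
    return {ip: len(p) for ip, p in peers.items()}


def isolated_ips(flows, threshold):
    """Assumed rule: an IP is isolated if it has at most `threshold` peers."""
    return {ip for ip, deg in ip_degrees(flows).items() if deg <= threshold}


def line_graph(flows):
    """Build line-graph edges: each flow index is a node, and two flows
    are connected when they share at least one IP endpoint."""
    incident = defaultdict(list)
    for idx, (src, dst) in enumerate(flows):
        incident[src].append(idx)
        if dst != src:
            incident[dst].append(idx)
    edges = set()
    for idxs in incident.values():
        # Indices are appended in increasing order, so a < b in every pair.
        for a, b in combinations(idxs, 2):
            edges.add((a, b))
    return edges


# Toy traffic: 10.0.0.1 talks to three hosts, 10.0.0.5 talks to only one.
flows = [
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.1", "10.0.0.3"),
    ("10.0.0.1", "10.0.0.4"),
    ("10.0.0.5", "10.0.0.4"),
]

isolated = isolated_ips(flows, threshold=1)
lg_edges = line_graph(flows)
```

In this toy graph, flow 3 (the one touching `10.0.0.5`) has a single neighbor in the line graph, while flows 0 to 2 each have at least two. That gap mirrors the imbalance in message passing the abstract describes, and it is the kind of neighborhood that local augmentation would aim to enrich.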