Toward Generating Communication Graph Datasets for Botnet Detection in Autonomous Systems

Yuhao Yan, Bo Lang, Xiaoyuan Meng, Nan Xiao
DOI: https://doi.org/10.1109/tifs.2024.3453172
IF: 7.231
2024-09-11
IEEE Transactions on Information Forensics and Security
Abstract: Botnets are one of the main threats to cybersecurity because of their concealment and hazardous nature, especially in autonomous systems (ASs) such as campus networks. Graph-based detection methods are attracting increasing attention because they can find and use the topological features of botnets. However, constructing or obtaining a botnet dataset is difficult, and almost all existing public datasets suffer from extreme imbalance and poor authenticity, which makes training graph-based detection models challenging. To address these problems, we propose a scalable and efficient role-based multistage growth method for generating AS botnet datasets. Our method generates a background communication graph based on complex network theory, models botnet behaviors by building a state machine, and generates botnet traffic. The experimental results show that our method can effectively restore the AS communication graph and that the generated datasets can significantly improve the performance of various graph-based detection models. Our generated dataset is available at https://github.com/Yebmoon/Botnet-graph-dataset.
computer science, theory & methods; engineering, electrical & electronic