Botnet DGA Domain Name Classification Using Transformer Network with Hybrid Embedding

Ling Ding, Peng Du, Haiwei Hou, Jian Zhang, Di Jin, Shifei Ding
DOI: https://doi.org/10.1016/j.bdr.2023.100395
IF: 3.3
2023-05-13
Big Data Research
Abstract: Botnets are among the most severe threats to cyber security. They typically use domain names generated by Domain Generation Algorithms (DGAs) to communicate with their Command and Control (C&C) infrastructure. DGA detection and classification play an important role in helping cyber security researchers detect botnet C&C servers. However, many existing DGA detection models rely on a single-scale word embedding method, and very few are specifically designed to extract more effective features for DGA detection from multi-scale word embeddings. To address these problems, we first propose a hybrid word embedding method that combines character-level and bigram-level embeddings to make full use of the information in domain names. We then design a deep neural network that uses this hybrid embedding to distinguish DGA domains from known legitimate domains. Finally, we evaluate the hybrid embedding method and the proposed model on the ONIST dataset and compare them with several state-of-the-art DGA classification methods.
computer science, information systems, artificial intelligence, theory & methods
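The hybrid embedding described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the character-level and bigram-level embeddings are combined, so the position-wise concatenation, the padding scheme, the embedding dimension, and every function name below are assumptions made for illustration, not the authors' implementation. The embedding tables are randomly initialised stand-ins for weights that a real model would learn.

```python
import random

PAD = "<pad>"  # hypothetical padding token, index 0 in both vocabularies


def char_tokens(domain):
    """Split a domain name into single characters."""
    return list(domain.lower())


def bigram_tokens(domain):
    """Split a domain name into overlapping character bigrams."""
    d = domain.lower()
    return [d[i:i + 2] for i in range(len(d) - 1)]


def build_vocab(token_lists):
    """Map each distinct token to an integer index; index 0 is padding."""
    vocab = {PAD: 0}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def make_table(vocab_size, dim, rng):
    """Random embedding table (stand-in for learned weights)."""
    table = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
             for _ in range(vocab_size)]
    table[0] = [0.0] * dim  # padding maps to the zero vector
    return table


def hybrid_embed(domain, char_vocab, bigram_vocab,
                 char_table, bigram_table, max_len):
    """Return a max_len x (char_dim + bigram_dim) matrix.

    Assumption: both token sequences are truncated or padded to max_len
    and their vectors are concatenated position by position. Tokens not
    in the vocabulary fall back to the padding index.
    """
    chars = char_tokens(domain)[:max_len]
    bigrams = bigram_tokens(domain)[:max_len]
    chars += [PAD] * (max_len - len(chars))
    bigrams += [PAD] * (max_len - len(bigrams))
    rows = []
    for c, b in zip(chars, bigrams):
        c_vec = char_table[char_vocab.get(c, 0)]
        b_vec = bigram_table[bigram_vocab.get(b, 0)]
        rows.append(c_vec + b_vec)  # list concatenation
    return rows


# Toy example with one legitimate-looking and one random-looking domain.
domains = ["google.com", "xjkqpzvw.net"]
char_vocab = build_vocab(char_tokens(d) for d in domains)
bigram_vocab = build_vocab(bigram_tokens(d) for d in domains)
rng = random.Random(0)
char_table = make_table(len(char_vocab), 4, rng)
bigram_table = make_table(len(bigram_vocab), 4, rng)
matrix = hybrid_embed("google.com", char_vocab, bigram_vocab,
                      char_table, bigram_table, max_len=16)
```

In this sketch each domain becomes a fixed-size matrix whose rows carry both scales of information, which is the kind of input a downstream sequence model such as a Transformer encoder could consume. Other ways of combining the two scales (for example, summing vectors or feeding two separate sequences) are equally compatible with the abstract.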