DGA botnet detection method based on capsule network and k-means routing

Xiaoyang Liu, Jiamiao Liu
DOI: https://doi.org/10.1007/s00521-022-06904-3
2022-01-31
Neural Computing and Applications
Abstract: Most current mainstream DGA domain name detection methods represent numerical features as scalars, which loses the spatial feature information of domain name characters. This paper proposes LSTM-CapsNet, a sequence capsule network based on the k-means routing algorithm, which uses only the text of DGA domain names for detection. The model uses a bidirectional LSTM unit to extract basic features for the capsule network and uses the k-means algorithm to cluster vector features to implement routing. To verify the proposed LSTM-CapsNet model and ensure the reliability of the experiment, data are collected from two different sources: DGA domain names from real networks, defined as Real-Dataset, and DGA domain names produced by domain generation algorithms, defined as Gen-Dataset. The model is compared with current state-of-the-art DGA domain name detection methods on both datasets. The experimental results show that the proposed model achieves F-scores of 99.17% and 97.75% for DGA domain name recognition on the two datasets, and that it is also very competitive in DGA domain name family recognition. Compared with existing DGA domain name family classification models, the proposed model's F-score exceeds 89% in Gen-Dataset multi-class recognition. The model not only improves DGA domain name recognition and DGA domain name family recognition but also shows outstanding real-time performance in model testing.
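To make the routing idea in the abstract concrete, the following is a minimal NumPy sketch of routing by k-means-style clustering between two capsule layers. It is an illustration under stated assumptions, not the authors' implementation: the function names `squash` and `kmeans_routing`, the cosine-similarity assignment, the softmax weighting, the mean initialization, and the three iterations are all choices made here for the example. The input `u_hat` stands in for the prediction vectors that lower-level capsules (for instance, ones built from bidirectional LSTM features) would send to each output capsule.

```python
import numpy as np


def squash(v, axis=-1, eps=1e-9):
    """Standard capsule squash: keeps direction, maps the norm into [0, 1)."""
    sq = np.sum(v * v, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * v / np.sqrt(sq + eps)


def kmeans_routing(u_hat, n_iter=3, eps=1e-9):
    """Illustrative k-means-style routing (assumed formulation).

    u_hat: array of shape (n_in, n_out, dim), the prediction vector from
           each input capsule i for each output capsule j.
    Returns the squashed output capsules (n_out, dim) and the final
    soft assignment weights (n_in, n_out).
    """
    # Initialize each output capsule (cluster center) as the mean prediction.
    v = u_hat.mean(axis=0)
    for _ in range(n_iter):
        # Assignment step: cosine similarity between predictions and centers.
        u_norm = u_hat / (np.linalg.norm(u_hat, axis=-1, keepdims=True) + eps)
        v_norm = v / (np.linalg.norm(v, axis=-1, keepdims=True) + eps)
        sim = np.sum(u_norm * v_norm[None, :, :], axis=-1)  # (n_in, n_out)
        # Softmax over output capsules gives soft cluster memberships.
        sim = sim - sim.max(axis=1, keepdims=True)
        c = np.exp(sim)
        c = c / c.sum(axis=1, keepdims=True)
        # Update step: each center becomes the weighted sum of its predictions.
        v = np.sum(c[:, :, None] * u_hat, axis=0)
    return squash(v), c


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u_hat = rng.normal(size=(6, 2, 4))  # 6 input capsules, 2 output capsules
    v, c = kmeans_routing(u_hat)
    print(v.shape, c.shape)
```

In this sketch the output capsules play the role of cluster centers, and the alternating assignment and update steps mirror the two phases of k-means, with soft memberships in place of hard ones.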
computer science, artificial intelligence