CI_GRU: an Efficient DGA Botnet Classification Model Based on an Attention Recurrence Plot.

Han Wang, Zhangguo Tang, Huanzhou Li, Jian Zhang, Shuangcheng Li, Junfeng Wang
DOI: https://doi.org/10.1016/j.comnet.2023.109992
2022-01-01
SSRN Electronic Journal
Abstract: Malware often embeds domain generation algorithms (DGAs) to evade firewall interception and blacklist/whitelist domain matching while hiding command and control (C&C) servers, which tightens control over botnets. DGA domains are diverse and difficult to collect, so the available datasets are highly imbalanced. Domain names generated by different DGA families differ little at the sequence level, which makes their features difficult to extract. As a result, deep-learning-based DGA domain name classification models suffer from poor accuracy, poor generalization, and bloated model size. To address these problems, this paper presents a visual representation of sequence data and a DGA domain classification model. First, each DGA domain name is mapped to the attention recurrence plot (Att_RP) proposed in this paper, which enriches the phase-space features of the data and differentiates the key ones. Att_RP is then fed to the proposed DGA domain name classification model (CI_GRU), which transforms the data dimensions and then performs classification. Experiments show that the model's classification accuracy, F1 score, and recall on a variety of DGA families in the wild all exceed 99%, and that it also accurately classifies four types of crafted DGA families. Compared with similar models, the model achieves high classification accuracy, low time consumption, low generalization error, and high efficiency, at less than one-tenth of their size.
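The abstract does not specify how Att_RP is built, so the sketch below illustrates only the underlying idea: a plain, textbook recurrence plot computed from a domain name. It is not the paper's Att_RP and contains no attention weighting. The character encoding (Unicode code points), the embedding parameters `dim` and `delay`, the optional threshold `eps`, and all function names are assumptions chosen for illustration.

```python
from math import sqrt


def encode_domain(domain):
    """Map each character to its code point (an assumed, illustrative encoding)."""
    return [float(ord(ch)) for ch in domain.lower()]


def embed(series, dim=2, delay=1):
    """Time-delay embedding: state i is (x[i], x[i+delay], ..., x[i+(dim-1)*delay])."""
    n_states = len(series) - (dim - 1) * delay
    if n_states <= 0:
        raise ValueError("series too short for the chosen dim and delay")
    return [tuple(series[i + k * delay] for k in range(dim)) for i in range(n_states)]


def recurrence_plot(series, dim=2, delay=1, eps=None):
    """Pairwise distances between phase-space states.

    With eps=None the raw Euclidean distances are returned (an unthresholded
    recurrence plot); otherwise each entry is 1.0 if the distance is <= eps
    and 0.0 if not.
    """
    states = embed(series, dim, delay)
    n = len(states)
    plot = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist = sqrt(sum((a - b) ** 2 for a, b in zip(states[i], states[j])))
            plot[i][j] = dist if eps is None else float(dist <= eps)
    return plot


# An 11-character name with dim=2, delay=1 yields 10 states, so a 10 x 10 image.
rp = recurrence_plot(encode_domain("example.com"), dim=2, delay=1)
```

The resulting square matrix can be treated as a single-channel image and passed to an image-style classifier. How the paper weights or transforms this image before it reaches CI_GRU is not described in the abstract.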