CNN-based DGA Detection with High Coverage

Shaofang Zhou, Lanfen Lin, Junkun Yuan, Feng Wang, Zhaoting Ling, Jia Cui
DOI: https://doi.org/10.1109/isi.2019.8823200
2019-01-01
Abstract: Attackers often use domain generation algorithms (DGAs) to create various kinds of pseudorandom domains dynamically and select a subset of them to connect with command and control servers. It is therefore important to automatically detect algorithmically generated domains (AGDs). AGDs can be broken down into two categories: character-based domains and wordlist-based domains. Recently, methods based on machine learning and deep learning have been widely explored. However, much of the previous work performs well in detecting one kind of DGA family but poorly in classifying the other. A general detection system that is applicable to both kinds of domains remains a challenge. To address this problem, we propose a novel real-time detection method with high accuracy as well as high coverage. We first convert a domain name into a sequence of word-level or character-level components, then design a deep neural network based on a temporal convolutional network to extract the implicit pattern and classify the domain into two or more categories. Our experimental results demonstrate that our model outperforms state-of-the-art approaches in both binary classification and multi-class classification, and performs well in detecting different kinds of DGAs. In addition, the high training efficiency of our model allows it to adapt to new malicious domains quickly.
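The two steps named in the abstract (turning a domain into a sequence of character-level components, then applying convolutions over that sequence) can be illustrated with a minimal sketch. This is not the authors' model: the alphabet, the padding scheme, the function names `encode_domain` and `causal_dilated_conv`, and the fixed kernel weights are all assumptions made for illustration. Temporal convolutional networks are commonly built from stacked dilated causal convolutions with learned weights, so the second function shows only a single such operation on raw integer ids.

```python
import string

# Hypothetical vocabulary: lowercase letters, digits, hyphen, dot.
# Id 0 is reserved for padding and for characters outside the alphabet.
ALPHABET = string.ascii_lowercase + string.digits + "-."
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


def encode_domain(domain, max_len=16):
    """Map a domain name to a fixed-length list of integer character ids."""
    ids = [CHAR_TO_ID.get(ch, 0) for ch in domain.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))  # right-pad with zeros


def causal_dilated_conv(seq, kernel, dilation=1):
    """1-D causal convolution: out[t] uses seq[t], seq[t-d], seq[t-2d], ..."""
    out = []
    for t in range(len(seq)):
        total = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # positions before the start act as zero padding
                total += weight * seq[idx]
        out.append(total)
    return out


# Illustrative use: encode one domain, then apply a single filter.
ids = encode_domain("abc.com", max_len=8)
features = causal_dilated_conv(ids, kernel=[1.0, -1.0], dilation=2)
```

In a real detector the integer ids would normally be mapped to learned embeddings, and many filters with trained weights would be stacked before a classification layer. A word-level variant would replace `encode_domain` with a segmentation of the domain into dictionary words.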