OpenSMax: Unknown Domain Generation Algorithm Detection.

Yao Lai, Guolou Ping, Yuexin Wu, Chenhui Lu, Xiaojun Ye
DOI: https://doi.org/10.3233/faia200301
2020-01-01
Abstract: Botnets have become one of the most frequent attack patterns in cyberspace, and most of them involve Domain Generation Algorithms (DGAs). Many researchers have therefore proposed machine learning models for DGA domain name detection, but detecting unknown classes of DGA domain names (unknown DGAs) remains a challenging problem. Detecting unknown classes is also known as the open set recognition problem. To tackle this issue, we propose a novel classification model, OpenSMax, which not only detects DGA domain names but also classifies them into known and unknown classes of DGAs. In this model, we use one-hot encoding and a Long Short-Term Memory (LSTM) model to extract the features of the Top Level Domain (TLD) and the Second Level Domain (SLD), respectively. These two feature categories are then concatenated and propagated forward through two fully connected layers for known DGA domain name detection and classification. Finally, both the openmax layer (the layer before the softmax layer) and the softmax layer are used to build One-Class Support Vector Machine (SVM) models for unknown class recognition. In our experiments, the OpenSMax model outperforms state-of-the-art methods in both known and unknown DGA domain name detection tasks. OpenSMax also provides a bounded open space risk in theory, and therefore formally provides an effective solution for unknown DGA domain name detection.
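The input preparation described in the abstract (a one-hot vector for the TLD and a character sequence for the SLD that an LSTM could consume) might be sketched as follows. This is only an illustration, not the authors' implementation: the TLD vocabulary, the character set, the padding length, and the helper names `split_domain`, `one_hot_tld`, and `encode_sld` are all assumptions, and the split on the last dot ignores multi-label public suffixes such as `co.uk`.

```python
import string

# Hypothetical vocabularies (assumptions, not taken from the paper).
TLD_VOCAB = ["com", "net", "org", "info", "biz"]
SLD_CHARS = string.ascii_lowercase + string.digits + "-"
# Index 0 is reserved for padding and for characters outside SLD_CHARS.
CHAR_INDEX = {ch: i + 1 for i, ch in enumerate(SLD_CHARS)}


def split_domain(domain):
    """Naively split a domain name into (SLD, TLD) on the last dot."""
    name = domain.strip().lower().rstrip(".")
    rest, _, tld = name.rpartition(".")
    # Keep only the label directly to the left of the TLD as the SLD.
    sld = rest.rpartition(".")[2]
    return sld, tld


def one_hot_tld(tld):
    """One-hot vector over TLD_VOCAB, with a final slot for unseen TLDs."""
    vec = [0] * (len(TLD_VOCAB) + 1)
    idx = TLD_VOCAB.index(tld) if tld in TLD_VOCAB else len(TLD_VOCAB)
    vec[idx] = 1
    return vec


def encode_sld(sld, max_len=16):
    """Map SLD characters to integer ids, truncated or right-padded to max_len."""
    ids = [CHAR_INDEX.get(ch, 0) for ch in sld[:max_len]]
    return ids + [0] * (max_len - len(ids))


if __name__ == "__main__":
    sld, tld = split_domain("www.Example.COM")
    print(sld, tld)
    print(one_hot_tld(tld))
    print(encode_sld(sld))
```

In a full model, the integer sequence would typically pass through an embedding and an LSTM, and the resulting vector would be concatenated with the one-hot TLD vector before the fully connected layers; those stages are omitted here.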