Exploit Internal Structural Information for IoT Malware Detection Based on Hierarchical Transformer Model

Xiaohui Hu, Rui Sun, Kejia Xu, Yongzheng Zhang, Peng Chang
DOI: https://doi.org/10.1109/trustcom50675.2020.00124
2020-01-01
Abstract: The number of IoT devices continues to grow, but their security cannot be guaranteed. Many IoT devices are infected with malware and form huge botnets, which can launch DDoS attacks and cause heavy losses. In recent years, IoT malware families have tended to concentrate on ARM-based IoT devices. The most widespread families are Mirai and Gafgyt. In this paper, we automatically extract the instruction sequences of samples from these two families and treat the instruction sequences as a language that describes the samples. We map the instruction sequences into a word vector space with Word2Vec. We then exploit the internal hierarchical structure of functions in malware to construct a hierarchical language model, based on the transformer encoder, that classifies the samples. Visualizing the weights of the model reveals the correlation among the functions in a sample, which can help an analyst find the key functions. We train our model on IoT software samples comprising Mirai samples, Gafgyt samples, and benign samples. In the experiments, our model achieves a 99.12% malware recall rate and a 94.67% family classification accuracy, outperforming other methods.
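The two-level idea in the abstract (instructions within a function, then functions within a sample) can be illustrated with a minimal sketch. This is not the authors' model. It assumes a hash-based stand-in for the Word2Vec lookup, a single attention head with identity projections and no learned weights, mean pooling between levels, and no classifier head. The names `embed_token`, `self_attention`, `encode_sample`, and the width `DIM` are hypothetical, as are the example mnemonics.

```python
import hashlib
import math

DIM = 8  # embedding width (assumed; the abstract does not state one)


def embed_token(token, dim=DIM):
    """Stand-in for a Word2Vec lookup: hash the mnemonic to a fixed vector."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [b / 255.0 - 0.5 for b in digest[:dim]]


def mean_pool(vectors):
    """Average a list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    Returns (outputs, weights); weights[i][j] is how strongly item i
    attends to item j.
    """
    scale = math.sqrt(len(vectors[0]))
    outputs, weights = [], []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in vectors]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(wj * v[i] for wj, v in zip(w, vectors)) for i in range(len(q))]
        )
    return outputs, weights


def encode_sample(functions):
    """functions: list of functions, each a list of instruction mnemonics."""
    # Level 1: attend over the instructions of each function, then pool
    # to one vector per function.
    func_vecs = []
    for instrs in functions:
        tok_out, _ = self_attention([embed_token(t) for t in instrs])
        func_vecs.append(mean_pool(tok_out))
    # Level 2: attend over the function vectors, then pool to one vector
    # per sample. The level-2 weights relate functions to each other.
    func_out, func_weights = self_attention(func_vecs)
    return mean_pool(func_out), func_weights


# Hypothetical sample with three functions of ARM-style mnemonics.
sample = [
    ["push", "mov", "bl", "pop"],
    ["ldr", "cmp", "bne", "str"],
    ["mov", "svc"],
]
sample_vec, func_weights = encode_sample(sample)
```

In this sketch `func_weights` is a 3 x 3 matrix whose rows each sum to 1. It plays the role of the function-level weights that the abstract describes visualizing, though here it reflects only the untrained stand-in embeddings and carries no learned meaning.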