Enhanced Few-Shot Malware Traffic Classification via Integrating Knowledge Transfer With Neural Architecture Search

Xixi Zhang, Qin Wang, Maoyang Qin, Yu Wang, Tomoaki Ohtsuki, Bamidele Adebisi, Hikmet Sari, Guan Gui
DOI: https://doi.org/10.1109/tifs.2024.3396624
IF: 7.231
2024-05-14
IEEE Transactions on Information Forensics and Security
Abstract: Malware traffic classification (MTC) is an important research topic in cyber security. Existing deep-learning-based MTC methods assume that sufficient high-quality samples and powerful computing resources are available. However, both are hard to obtain in real applications, especially in IoT environments. In this paper, we propose a few-shot MTC (FS-MTC) method that combines knowledge transfer with neural architecture search (NAS), referred to as NAS-based FS-MTC, which works with limited training samples and acceptable computational resources in order to mitigate these challenges. Specifically, the proposed method first converts raw network traffic into traffic images through data pre-processing, and these images serve as input to the neural network. Second, we use neural architecture search to adaptively search for an effective feature extraction model on the source domain (including Edge-IIoTset, Bot-IoT, and benign USTC-TFC2016). Third, the searched model is pre-trained on the source task to learn a generic feature representation of malware traffic. Finally, we use only a few malware traffic samples to fine-tune the pre-trained model so that it quickly adapts to new types of MTC tasks in realistic network environments. The experimental results show that the proposed NAS-based FS-MTC method achieves good scalability and classification performance across different FS-MTC tasks, including 5-way K-shot tasks on the USTC-TFC2016 dataset and 10-way K-shot tasks on the CIC-IoT dataset. Compared with state-of-the-art methods in the field of malware classification, the proposed NAS-based FS-MTC achieves higher classification accuracy. In particular, in the 1-shot case on the USTC-TFC2016 dataset, its average accuracy reaches 86.91%.
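The first step of the pipeline (raw traffic to traffic images) and the N-way K-shot task setup mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 28x28 image size, the zero-padding rule, and the names `bytes_to_image` and `sample_episode` are assumptions made for illustration, since the abstract does not specify them.

```python
import random
from typing import Dict, List, Sequence, Tuple

IMAGE_SIDE = 28  # assumed side length; the abstract does not state an image size


def bytes_to_image(payload: bytes, side: int = IMAGE_SIDE) -> List[List[int]]:
    """Truncate or zero-pad raw flow bytes to side*side values, then
    reshape them row by row into a 2-D grayscale grid (values 0..255)."""
    n = side * side
    fixed = payload[:n].ljust(n, b"\x00")  # hypothetical padding rule
    return [list(fixed[r * side:(r + 1) * side]) for r in range(side)]


def sample_episode(
    data: Dict[str, Sequence],
    n_way: int,
    k_shot: int,
    rng: random.Random,
) -> List[Tuple[object, str]]:
    """Draw a support set for one N-way K-shot task: pick n_way classes,
    then k_shot samples from each picked class."""
    classes = rng.sample(sorted(data), n_way)
    support = []
    for label in classes:
        for item in rng.sample(list(data[label]), k_shot):
            support.append((item, label))
    return support
```

Under these assumptions, a 5-way 1-shot task would call `sample_episode(data, 5, 1, rng)` on a dictionary mapping each class label to its list of traffic images, and the resulting five labeled images would be the only samples used for fine-tuning.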
computer science, theory & methods; engineering, electrical & electronic