Malware classification based on double byte feature encoding

Lin Li, Ying Ding, Bo Li, Mengqing Qiao, Biao Ye
DOI: https://doi.org/10.1016/j.aej.2021.04.076
IF: 6.626
2022-01-01
Alexandria Engineering Journal
Abstract: Many researchers analyze malware using static and dynamic analysis techniques combined with deep learning algorithms, and have achieved good results in malware classification. However, many studies extract features from only one source: either the .asm file generated by a decompiler or the .bytes file in hexadecimal representation. This paper integrates the features of both files, using word frequency and two deep learning algorithms to extract 184 opcode features and 16 probability features from the ASM file and the section file of the Kaggle dataset, respectively. A double byte feature coding method is then used to fuse the features of the two files. Finally, a convolutional neural network is used to classify the fused samples. The experimental results show an accuracy of 98.68% and a logarithmic loss of 0.022.
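The word-frequency step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a small, hypothetical opcode vocabulary (the paper uses 184 opcodes, which the abstract does not list) and a naive tokenizer. It is not the authors' implementation, and it does not cover the double byte coding or the CNN.

```python
import re
from collections import Counter

# Hypothetical vocabulary for illustration only; the paper's 184 opcodes
# are not given in the abstract.
OPCODES = ["mov", "push", "pop", "call", "jmp", "add", "sub", "xor", "ret"]


def opcode_frequencies(asm_text, vocabulary=OPCODES):
    """Return the relative frequency of each vocabulary opcode in asm_text.

    Tokenization is deliberately naive: every alphabetic token that matches
    a vocabulary entry is counted, wherever it appears on the line.
    """
    vocab = set(vocabulary)
    tokens = re.findall(r"[a-z]+", asm_text.lower())
    counts = Counter(tok for tok in tokens if tok in vocab)
    total = sum(counts.values())
    # Avoid division by zero for files that contain no known opcode.
    return [counts[op] / total if total else 0.0 for op in vocabulary]


if __name__ == "__main__":
    sample = "mov eax, ebx\npush eax\nmov ecx, 1\nret\n"
    print(opcode_frequencies(sample))
```

A real pipeline would parse the disassembly format properly, so that comments and data directives are not mistaken for instructions, before counting.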
engineering, multidisciplinary