deepCG: Classifying Metamorphic Malware Through Deep Learning of Call Graphs

Shuang Zhao, Xiaobo Ma, Wei Zou, Bo Bai
DOI: https://doi.org/10.1007/978-3-030-37228-6_9
2019-01-01
Abstract: As the state-of-the-art malware obfuscation technique, metamorphism has received wide attention. Metamorphic malware can mutate into countless variants during propagation by automatically obfuscating part of its executable code, thus posing serious challenges to all existing detection methods. To address this problem, a fundamental task is to identify the stable features that remain relatively invariant across all variants of a given type of metamorphic malware while still distinguishing it from other types. In this paper, we systematically study the obfuscation methods of metamorphic malware and show that, compared with frequently used fragmented features such as byte n-grams and opcode sequences, call graphs are more stable against metamorphism and can be leveraged to classify metamorphic malware effectively. Based on call graphs, we design a metamorphic malware classification method, dubbed deepCG, which enables automatic feature learning for metamorphic malware via deep learning. Specifically, we encapsulate the information of each call graph into an image, which is then fed into deep convolutional neural networks that classify the malware family. In particular, owing to its built-in training data enhancement approach, deepCG can achieve promising classification accuracy even with small-scale training samples. We evaluate deepCG on a PE malware dataset and the Microsoft BIG2015 dataset, achieving a test accuracy above 96%.
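The abstract does not say how a call graph is encoded as an image, so the following is only a minimal sketch of one plausible encoding: a fixed-size grayscale adjacency matrix, where a bright pixel at row r, column c means function r calls function c. The function name `call_graph_to_image`, the dictionary representation of the graph, the alphabetical node ordering, and the 0/255 pixel values are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List


def call_graph_to_image(call_graph: Dict[str, List[str]], size: int) -> List[List[int]]:
    """Render a call graph as a size x size grayscale adjacency image.

    Hypothetical encoding (not the paper's): pixel [r][c] is 255 when
    function r calls function c, and 0 otherwise.
    """
    # Collect every function that appears as a caller or a callee, and fix a
    # deterministic order so the same graph always yields the same image.
    nodes = sorted(set(call_graph) | {c for callees in call_graph.values() for c in callees})
    index = {name: i for i, name in enumerate(nodes)}

    image = [[0] * size for _ in range(size)]
    for caller, callees in call_graph.items():
        for callee in callees:
            r, c = index[caller], index[callee]
            # Assumption: nodes that do not fit on the canvas are dropped.
            if r < size and c < size:
                image[r][c] = 255
    return image


# Toy call graph with made-up function names.
graph = {
    "main": ["decrypt", "spread"],
    "decrypt": ["xor_loop"],
    "spread": ["decrypt"],
}
image = call_graph_to_image(graph, size=4)
```

An image built this way could be passed to any standard convolutional classifier. The intuition matching the abstract is that rewriting instructions inside a function leaves the caller-to-callee edges, and hence the pixels, unchanged.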