Misleading Malware Similarities Analysis By Automatic Data Structure Obfuscation

Zhi Xin, Huiyu Chen, Hao Han, Bing Mao, Li Xie
DOI: https://doi.org/10.1007/978-3-642-18178-8_16
2011-01-01
Abstract: Program obfuscation techniques have been widely used by malware to evade scanning by anti-virus detectors. However, signatures based on the data structures that appear in runtime memory render traditional code obfuscation ineffective. Laika [2] implements such signatures using Bayesian unsupervised learning, which clusters similar vectors of bytes in memory into the same class. We present a novel malware obfuscation technique that automatically obfuscates the data structure layout so that memory similarities between malware programs are blurred and hard to recognize. We design and implement the automatic data structure obfuscation technique as a GNU GCC compiler extension that automatically determines the obfuscability of each data structure and converts some of the unobfuscable data structures into obfuscable ones. In an evaluation on fourteen real-world malware programs, our tool obfuscates a high proportion of data structures: 60.19% of types and 60.49% of variables.
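To illustrate why changing a data structure's layout alters its memory image, the following minimal sketch uses Python's `ctypes` to build the same record with two different field orders and compares the resulting byte offsets. It is not the paper's GCC extension: the record, its field names, the helper functions, and the use of plain field reordering (including the seeded shuffle) are all assumptions made for illustration.

```python
import ctypes
import random

# Hypothetical record; field names and types are illustrative only.
FIELDS = [
    ("id", ctypes.c_uint32),
    ("flags", ctypes.c_uint8),
    ("timeout", ctypes.c_double),
    ("next", ctypes.c_void_p),
]


def make_struct(name, fields):
    """Build a ctypes Structure class with the given field order."""
    return type(name, (ctypes.Structure,), {"_fields_": list(fields)})


def layout(struct_cls):
    """Map each field name to its byte offset inside the structure."""
    return {fname: getattr(struct_cls, fname).offset
            for fname, _ in struct_cls._fields_}


def shuffled_struct(name, fields, seed):
    """Reorder fields with a seeded shuffle (a stand-in for a compile-time pass)."""
    reordered = list(fields)
    random.Random(seed).shuffle(reordered)
    return make_struct(name, reordered)


Original = make_struct("Original", FIELDS)
Variant = make_struct("Variant", list(reversed(FIELDS)))  # one fixed reordering

# Same fields, different offsets: the in-memory byte pattern no longer matches.
print(layout(Original))
print(layout(Variant))
```

Both classes carry the same information, but a detector that keys on where pointers, integers, and padding fall within an object would see two different byte patterns. Calling the hypothetical `shuffled_struct("Build7", FIELDS, seed=7)` shows how each build could receive its own ordering.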