Evading Deep Learning-Based Malware Detectors via Obfuscation: A Deep Reinforcement Learning Approach

Brian Etter, James Lee Hu, Mohammedreza Ebrahimi, Weifeng Li, Xin Li, Hsinchun Chen
2024-02-05
Abstract: Adversarial Malware Generation (AMG), the generation of adversarial malware variants to strengthen Deep Learning (DL)-based malware detectors, has emerged as a crucial tool in the development of proactive cyberdefense. However, the majority of extant works offer subtle perturbations or additions to executable files and do not explore full-file obfuscation. In this study, we show that an open-source encryption tool coupled with a Reinforcement Learning (RL) framework can successfully obfuscate malware to evade state-of-the-art malware detection engines and outperform techniques that use advanced modification methods. Our results show that the proposed method improves the evasion rate by 27%-49% compared to widely used state-of-the-art reinforcement learning-based methods.
Cryptography and Security, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The problem this paper attempts to solve is: **how to generate adversarial malware variants that can evade deep learning-based malware detectors through obfuscation techniques**. Specifically, the authors explore combining deep reinforcement learning (DRL) with open-source obfuscation tools to obfuscate the entire malware file, thereby increasing the rate at which malware evades detection.

### Problem Background

Most existing Adversarial Malware Generation (AMG) research focuses on minor modifications or additions to executable files rather than on obfuscating the entire file. Such minor modifications may reduce the stealth of adversarial variants and limit the potential of AMG techniques. In addition, these methods may not faithfully simulate how attackers behave in practice, since real attackers usually apply complex operations such as encrypting, packing, or encoding to the entire file.

### Research Objectives

To overcome these limitations, the research proposes a framework named OBFU-mal, which uses deep reinforcement learning and open-source obfuscation tools (such as Darkarmour) to automatically generate adversarial malware variants. Specific objectives include:

1. **Increase the success rate of evading detection**: By introducing new obfuscation actions (such as multi-layer XOR encryption), malware can more effectively evade deep learning-based detectors.
2. **Simulate real-world attack behaviors**: By using the same techniques and tools as actual attackers, the generated adversarial samples come closer to the real threat environment.
3. **Enhance the robustness of malware detectors**: By retraining detectors on the generated adversarial samples, their defense against future attacks is improved.
### Method Overview

The core of the OBFU-mal framework is a reinforcement learning agent based on a Deep Q-Network (DQN), which can select the optimal sequence of actions in the extended action space to obfuscate malware. The action space includes a variety of obfuscation operations, such as:

- **Append bytes**
- **Modify import table**
- **Rename sections**
- **Remove signature**
- **Compress file (using UPX)**
- **Apply XOR encryption loops**

By iteratively applying these actions and adjusting according to the detector's feedback, OBFU-mal can generate highly obfuscated malware variants and significantly improve their ability to evade detection.

### Experimental Results

The experimental results show that the malware variants generated by OBFU-mal perform well in evading two mainstream malware detectors, MalConv and LGBM/EMBER. Specifically, the average evasion rate of OBFU-mal is 65.15% on MalConv and 79.20% on LGBM/EMBER, both significantly higher than the existing benchmark methods.

### Conclusion

This research demonstrates the potential of deep reinforcement learning combined with full-file obfuscation techniques in generating adversarial malware. This method not only increases the success rate of malware evading detection but also better simulates real-world attack behaviors, thus providing a valuable reference for developing more powerful malware detection systems.

### Formula Representation

In this paper, the evasion rate \(E\) is defined as follows:

\[E = \frac{M_e}{M_t}\]

where:

- \(M_e\) represents the number of samples that successfully evade detection.
- \(M_t\) represents the total number of generated adversarial samples.

This formula is used to quantify the performance of different methods in terms of evading detection.
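The evasion-rate definition above can be illustrated with a minimal Python sketch. The function name `evasion_rate`, the boolean verdict list, and the sample values are assumptions made for illustration; they are not taken from the paper or its code, and the verdicts are made-up placeholders rather than experimental data.

```python
from typing import Sequence


def evasion_rate(flagged: Sequence[bool]) -> float:
    """Compute E = M_e / M_t from per-sample detector verdicts.

    flagged[i] is True when the detector classified the i-th generated
    variant as malicious, and False when the variant evaded detection.
    (This input format is an assumption made for this sketch.)
    """
    m_t = len(flagged)  # M_t: total number of generated adversarial samples
    if m_t == 0:
        raise ValueError("evasion rate is undefined for an empty sample set")
    m_e = sum(1 for is_flagged in flagged if not is_flagged)  # M_e: evaded
    return m_e / m_t


# Hypothetical verdicts for five variants: three of them evade detection.
verdicts = [True, False, False, True, False]
print(f"E = {evasion_rate(verdicts):.2%}")  # E = 60.00%
```

Under this definition, a reported figure such as 65.15% would mean that roughly 65 of every 100 generated variants were not flagged by the detector.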