Abstract: Adversarial Malware Generation (AMG), the generation of adversarial malware variants to strengthen Deep Learning (DL)-based malware detectors, has emerged as a crucial tool in the development of proactive cyberdefense. However, the majority of extant works offer subtle perturbations or additions to executable files and do not explore full-file obfuscation. In this study, we show that an open-source encryption tool coupled with a Reinforcement Learning (RL) framework can successfully obfuscate malware to evade state-of-the-art malware detection engines and outperform techniques that use advanced modification methods. Our results show that the proposed method improves the evasion rate by 27%-49% compared to widely used state-of-the-art reinforcement learning-based methods.

What problem does this paper attempt to address?

The paper addresses the following problem: **how to generate adversarial malware variants that evade deep-learning-based malware detectors through obfuscation**. Specifically, the authors explore combining deep reinforcement learning (DRL) with open-source obfuscation tools to obfuscate the entire malware file, thereby increasing the rate at which malware evades detection.

### Problem Background

Most existing Adversarial Malware Generation (AMG) research focuses on minor modifications or additions to executable files; few works obfuscate the entire file. Such minor modifications may reduce the stealth of adversarial variants and limit the potential of AMG techniques. They may also fail to reflect how attackers operate in practice, since real attackers usually apply whole-file operations such as encryption, packing, or encoding.

### Research Objectives

To overcome these limitations, the paper proposes a framework named OBFU-mal, which uses deep reinforcement learning and open-source obfuscation tools (such as Darkarmour) to automatically generate adversarial malware variants. Specific objectives include:

1. **Increase the success rate of evading detection**: By introducing new obfuscation actions (such as multi-layer XOR encryption), malware can more effectively evade deep-learning-based detectors.
2. **Simulate real-world attack behaviors**: By using the same techniques and tools as actual attackers, the generated adversarial samples are closer to the real threat environment.
3. **Enhance the robustness of malware detectors**: By retraining detectors on the generated adversarial samples, their defense against future attacks is improved.

### Method Overview

The core of the OBFU-mal framework is a reinforcement learning agent based on a Deep Q-Network (DQN), which selects a sequence of actions from an extended action space to obfuscate malware. The action space includes a variety of obfuscation operations, such as:

- **Append bytes**
- **Modify import table**
- **Rename sections**
- **Remove signature**
- **Compress file (using UPX)**
- **Apply XOR encryption loops**

By iteratively applying these actions and adjusting according to the detector's feedback, OBFU-mal generates highly obfuscated malware variants and significantly improves their ability to evade detection.

### Experimental Results

The malware variants generated by OBFU-mal were evaluated against two mainstream malware detectors, MalConv and LGBM/EMBER. The average evasion rate of OBFU-mal is 65.15% on MalConv and 79.20% on LGBM/EMBER, both significantly higher than the existing benchmark methods.

### Conclusion

This research demonstrates the potential of deep reinforcement learning combined with full-file obfuscation techniques for generating adversarial malware. The method both increases the success rate of evading detection and better simulates real-world attack behaviors, providing a valuable reference for developing more robust malware detection systems.

### Formula Representation

In this paper, the evasion rate \(E\) is defined as follows:

\[E=\frac{M_e}{M_t}\]

where:

- \(M_e\) represents the number of samples that successfully evade detection.
- \(M_t\) represents the total number of generated adversarial samples.

This formula is used to quantify the performance of different methods in terms of evading detection.
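As a quick illustration of the evasion-rate formula above, the following Python sketch computes \(E\) from a list of per-sample detector verdicts. The function name `evasion_rate`, the boolean-verdict representation, and the sample data are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Sequence


def evasion_rate(evaded: Sequence[bool]) -> float:
    """Return E = M_e / M_t for a batch of generated adversarial samples.

    evaded[i] is True when sample i was not flagged by the detector.
    """
    m_t = len(evaded)  # M_t: total number of generated adversarial samples
    if m_t == 0:
        raise ValueError("evasion rate is undefined for an empty batch")
    m_e = sum(evaded)  # M_e: number of samples that evaded detection
    return m_e / m_t


# Hypothetical verdicts for 8 variants (made-up data, not results from the paper).
verdicts = [True, False, True, True, False, True, True, False]
rate = evasion_rate(verdicts)
print(f"E = {rate:.2%}")  # 5 of 8 evade -> E = 62.50%
```

Because \(E\) is a simple ratio, rates reported for different detectors are only comparable when they are computed over the same set of generated samples.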

Evading Deep Learning-Based Malware Detectors via Obfuscation: A Deep Reinforcement Learning Approach

Black-Box Adversarial Attacks Against Deep Learning Based Malware Binaries Detection with GAN

Adversarial Malware Binaries: Evading Deep Learning for Malware Detection in Executables

Binary Black-box Evasion Attacks Against Deep Learning-based Static Malware Detectors with Adversarial Byte-Level Language Model

DeepMal: maliciousness-Preserving adversarial instruction learning against static malware detection

An IRL-based malware adversarial generation method to evade anti-malware engines

Malware Analysis Using Machine Learning and Deep Learning Techniques

PAD: Towards Principled Adversarial Malware Detection Against Evasion Attacks

Single-Shot Black-Box Adversarial Attacks Against Malware Detectors: A Causal Language Model Approach

Adversarial Deep Learning for Robust Detection of Binary Encoded Malware

Multi-view Representation Learning from Malware to Defend Against Adversarial Variants

MalwareTotal: Multi-Faceted and Sequence-Aware Bypass Tactics Against Static Malware Detection

Neural Malware Control with Deep Reinforcement Learning

Exposing Weaknesses of Malware Detectors with Explainability-Guided Evasion Attacks

Creating Valid Adversarial Examples of Malware

Semantics-Preserving Reinforcement Learning Attack Against Graph Neural Networks for Malware Detection

Robust Android Malware Detection System against Adversarial Attacks using Q-Learning

Protecting from Malware Obfuscation Attacks through Adversarial Risk Analysis

A Wolf in Sheep's Clothing: Practical Black-box Adversarial Attacks for Evading Learning-based Windows Malware Detection in the Wild