A Comparison of Adversarial Learning Techniques for Malware Detection

Pavla Louthánová, Matouš Kozák, Martin Jureček, Mark Stamp

DOI: https://doi.org/10.1007/s11416-024-00519-z

2023-08-19

Abstract: Machine learning has proven to be a useful tool for automated malware detection, but machine learning models have also been shown to be vulnerable to adversarial attacks. This article addresses the problem of generating adversarial malware samples, specifically malicious Windows Portable Executable files. We summarize and compare work that has focused on adversarial machine learning for malware detection. We use gradient-based, evolutionary algorithm-based, and reinforcement-based methods to generate adversarial samples, and then test the generated samples against selected antivirus products. We compare the selected methods in terms of accuracy and practical applicability. The results show that applying optimized modifications to previously detected malware can lead to incorrect classification of the file as benign. It is also known that generated malware samples can be successfully used against detection models other than those used to generate them and that using combinations of generators can create new samples that evade detection. Experiments show that the Gym-malware generator, which uses a reinforcement learning approach, has the greatest practical potential. This generator achieved an average sample generation time of 5.73 seconds and the highest average evasion rate of 44.11%. Using the Gym-malware generator in combination with itself improved the evasion rate to 58.35%.

Cryptography and Security, Machine Learning

What problem does this paper attempt to address?

The paper addresses the problem of generating adversarial malware samples, specifically malicious Windows Portable Executable (PE) files. It summarizes and compares prior work on adversarial machine learning for malware detection, generates adversarial samples with gradient-based, evolutionary algorithm-based, and reinforcement learning-based methods, and tests those samples against selected antivirus products. The methods are compared in terms of accuracy and practical applicability. The results indicate that applying optimized modifications to previously detected malware can cause the file to be misclassified as benign. Generated samples can also evade detection models other than the ones used to generate them, and combining generators can produce new samples that evade detection. In the experiments, the Gym-malware generator, which uses reinforcement learning, showed the greatest practical potential: an average sample generation time of 5.73 seconds and the highest average evasion rate, 44.11%. Using the Gym-malware generator in combination with itself raised the evasion rate to 58.35%.
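The two quantities reported above, an evasion rate and the effect of combining a generator with itself, can be illustrated with a minimal sketch. It assumes the evasion rate is the percentage of generated samples that a detector no longer flags, computed over samples that were all flagged before modification. The names `chain`, `evasion_rate`, `append_padding`, and `toy_detector` are hypothetical stand-ins, and the byte strings are toy data; none of this is the paper's code or any real detector.

```python
from typing import Callable, Iterable

Sample = bytes
Generator = Callable[[Sample], Sample]   # modifies a sample
Detector = Callable[[Sample], bool]      # True means "flagged as malicious"


def chain(*generators: Generator) -> Generator:
    """Compose generators so each one modifies the previous one's output."""
    def combined(sample: Sample) -> Sample:
        for generate in generators:
            sample = generate(sample)
        return sample
    return combined


def evasion_rate(samples: Iterable[Sample],
                 generator: Generator,
                 detector: Detector) -> float:
    """Percentage of generated samples that the detector does not flag."""
    generated = [generator(s) for s in samples]
    if not generated:
        return 0.0
    evaded = sum(1 for s in generated if not detector(s))
    return 100.0 * evaded / len(generated)


# Toy stand-ins (assumptions, for illustration only).
def append_padding(sample: Sample) -> Sample:
    return sample + b"\x00"


def toy_detector(sample: Sample) -> bool:
    return len(sample) < 6  # flags every sample shorter than 6 bytes


# Every sample in this toy corpus is flagged before modification.
corpus = [b"aaaa", b"bbbbb", b"cc"]

single = evasion_rate(corpus, append_padding, toy_detector)
combined = evasion_rate(corpus, chain(append_padding, append_padding), toy_detector)
print(f"single generator:   {single:.2f}%")
print(f"generator with itself: {combined:.2f}%")
```

In this toy setting, applying the same generator twice evades the detector for more samples than applying it once, which mirrors the shape of the reported result without reproducing its numbers.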


Creating Valid Adversarial Examples of Malware

Black-Box Adversarial Attacks Against Deep Learning Based Malware Binaries Detection with GAN

A Comparison of State-of-the-Art Techniques for Generating Adversarial Malware Binaries

An IRL-based malware adversarial generation method to evade anti-malware engines

Adversarial Malware Binaries: Evading Deep Learning for Malware Detection in Executables

Adversarial Deep Learning for Robust Detection of Binary Encoded Malware

Generating Adversarial Malware Examples for Black-Box Attacks Based on GAN

Query-Free Evasion Attacks Against Machine Learning-Based Malware Detectors with Generative Adversarial Networks

On the Effectiveness of Adversarial Samples against Ensemble Learning-based Windows PE Malware Detectors

Single-Shot Black-Box Adversarial Attacks Against Malware Detectors: A Causal Language Model Approach

FGAM: Fast Adversarial Malware Generation Method Based on Gradient Sign

MalwareTotal: Multi-Faceted and Sequence-Aware Bypass Tactics Against Static Malware Detection

Attack and Defense of Dynamic Analysis-Based, Adversarial Neural Malware Classification Models

Updating Windows Malware Detectors: Balancing Robustness and Regression against Adversarial EXEmples

Adversarial-Example Attacks Toward Android Malware Detection System

From Image to Code: Executable Adversarial Examples of Android Applications

Malware Detection in Adversarial Settings

ATMPA: Attacking Machine Learning-based Malware Visualization Detection Methods via Adversarial Examples

ATWM: Defense against adversarial malware based on adversarial training

Deep learning vs. adversarial noise: a battle in malware image analysis