Abstract: Due to advancements in malware capabilities, cyber-attacks have become widespread in the digital world. Cyber-attacks can hit an organization hard, causing damage such as data breaches, financial loss, and reputation loss. Two of the most prominent ransomware attacks in history are WannaCry and Petya, which affected companies' finances across the globe. Both WannaCry and Petya rendered operational processes inoperable by targeting critical infrastructure. It is nearly impossible for anti-virus applications that use traditional signature-based methods to detect this type of malware, because it has different characteristics on each infected computer. The most important feature of this type of malware is that it changes its contents using a mutation engine, so that the executable file has a different hash representation each time it propagates from one computer to another. To overcome this camouflage method, we created three-channel image files from malicious software. Because attackers modify the contents of the malware, they produce different variants of the same software; to address this, we created variants of the images by applying data augmentation methods. This article aims to provide image-augmentation-enhanced deep convolutional neural network (CNN) models for detecting malware families in a metamorphic malware environment. The main contributions of the article consist of three components: image generation from malware samples, image augmentation, and classification of malware families using a CNN model. In the first component, the collected malware samples are converted from binary files into 3-channel images using the windowing technique. The second component creates augmented versions of the images, and the last component builds a classification model. This study uses five different deep CNN models for malware family detection. The results obtained by the classifier demonstrate accuracy of up to 98%, which is quite satisfactory.
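The first two components can be illustrated with a minimal sketch. The abstract does not specify the windowing technique or the augmentation methods, so everything below is an assumption made for illustration: the helper names `bytes_to_rgb_image` and `add_noise` are hypothetical, the window is a plain non-overlapping 3-byte read, and random per-channel jitter stands in for whichever augmentations the study actually applied.

```python
import random


def bytes_to_rgb_image(data: bytes, width: int = 4):
    """Pack consecutive 3-byte windows into (R, G, B) pixels, row by row.

    Assumption: non-overlapping windows; the study's actual windowing
    technique is not described in the abstract.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    row_bytes = 3 * width
    # Zero-pad so the byte stream fills a whole number of image rows.
    padded = data + bytes(-len(data) % row_bytes)
    pixels = [tuple(padded[i:i + 3]) for i in range(0, len(padded), 3)]
    return [pixels[r:r + width] for r in range(0, len(pixels), width)]


def add_noise(image, max_delta: int = 8, seed: int = 0):
    """Return a copy of the image with bounded random jitter per channel.

    Assumption: a stand-in augmentation, not the study's augmentation set.
    """
    rng = random.Random(seed)

    def jitter(value: int) -> int:
        # Clamp so every channel stays a valid 8-bit intensity.
        return min(255, max(0, value + rng.randint(-max_delta, max_delta)))

    return [[tuple(jitter(c) for c in px) for px in row] for row in image]


if __name__ == "__main__":
    sample = bytes(range(20))            # stand-in for a malware binary
    image = bytes_to_rgb_image(sample)   # 2 rows of 4 RGB pixels
    variant = add_noise(image)           # one augmented variant
    print(len(image), len(image[0]), len(variant))
```

Each augmented variant keeps the shape of the source image, so in a pipeline like the one described it could be fed to the same CNN input layer as the original.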

Data augmentation based malware detection using convolutional neural networks
