Generative AI-Based Effective Malware Detection for Embedded Computing Systems

Sreenitha Kasarapu, Sanket Shukla, Rakibul Hassan, Avesta Sasan, Houman Homayoun, Sai Manoj Pudukotai Dinakarrao

2024-04-13

Abstract: One of the pivotal security threats for embedded computing systems is malicious software, a.k.a. malware. With efficiency and efficacy, Machine Learning (ML) has been widely adopted for malware detection in recent times. Despite being efficient, the existing techniques require a tremendous number of benign and malware samples for training and modeling an efficient malware detector. Furthermore, such constraints limit the detection of emerging malware due to the lack of sufficient malware samples required for efficient training. To address such concerns, we introduce a code-aware data generation technique that generates multiple mutated samples of the malware that the devices have seen only in limited numbers. Loss minimization ensures that the generated samples closely mimic the limitedly seen malware and mitigates impractical samples. The generated malware is then incorporated into the training set to formulate a model that can efficiently detect the emerging malware despite having limited exposure. The experimental results demonstrate that the proposed technique achieves an accuracy of 90% in detecting limitedly seen malware, which is approximately 3x more than the accuracy attained by state-of-the-art techniques.

Cryptography and Security, Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper addresses the problem of effectively detecting malware in embedded computing systems when training samples are limited. Existing techniques require a large number of benign and malware samples for training, but in practice it is very difficult to obtain enough samples of newly emerging malware, which limits how well these methods detect it. In addition, malware developers use code obfuscation, metamorphism, and polymorphism to make malware hard to detect with traditional static and dynamic analysis. To address these problems, the paper proposes a code-aware data generation technique based on Generative Adversarial Networks (GANs) that generates mutated malware samples, alleviating the shortage of samples and improving the detection of newly emerging complex malware.

The main contributions of the paper are:

1. **Introduced a code-aware generative AI architecture** for enlarging the training data set.
2. **Adopted a loss-minimization technique** to ensure that the generated data captures the code patterns and functionality of the complex malware that has been observed only in limited quantity.
3. **Used few-shot learning** to efficiently classify complex stealthy malware and code-obfuscated malware.

The experimental results show that the proposed technique achieves an accuracy of about 90% when using only limited samples, which is about 9% higher than a classifier trained on the limited samples alone.
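The workflow described above (generate mutated variants of a few observed malware samples, then add them to the training set) can be sketched roughly as follows. This is an illustration only, not the paper's implementation: the paper uses a GAN-based, code-aware generator, which is replaced here by a hypothetical stand-in `mutate` that adds small random noise to a numeric feature vector. The function names, the `(features, label)` representation, and the toy values are all assumptions made for this sketch.

```python
import random


def mutate(features, rng, scale=0.05):
    """Stand-in for the generator (NOT a GAN): jitter each feature, clamped to [0, 1]."""
    return [min(1.0, max(0.0, f + rng.uniform(-scale, scale))) for f in features]


def augment(training_set, few_shot_malware, variants_per_sample, seed=0):
    """Return training_set plus generated variants of each few-shot sample."""
    rng = random.Random(seed)
    augmented = list(training_set)
    for features, label in few_shot_malware:
        for _ in range(variants_per_sample):
            # Each variant keeps the label of the sample it was derived from.
            augmented.append((mutate(features, rng), label))
    return augmented


# Hypothetical toy data: (feature vector, label) pairs.
base = [([0.1, 0.2, 0.3], "B"), ([0.9, 0.8, 0.7], "M")]
few_shot = [([0.5, 0.4, 0.6], "S_m")]

augmented = augment(base, few_shot, variants_per_sample=5)
print(len(augmented))  # 2 original + 5 generated = 7
```

In a real pipeline the stand-in would be swapped for a trained generator, and the augmented set would then be passed to the classifier's training routine.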
### Formula Representation

- The data set \(D\) contains four types of samples: benign samples \(B\), traditional malware \(M\), randomly obfuscated malware \(O_m\), and stealthy malware \(S_m\):

  \[ D = \{B + M + O_m + S_m\} \]

- The limited-data version of the data set, \(D_x^l\), is randomly drawn from the entire data set \(D_n\) and contains no more than \(\nabla\%\) of the original number of samples \(n\):

  \[ D_x^l \subset D_n; \quad \forall x \leq \nabla\% \, n \]

- The classifier \(C\) needs to be able to distinguish between benign samples and the various types of malware in the limited samples:

  \[ C : (D_l) \Rightarrow (B, M, O_m, S_m) \]

In this way, the paper aims to improve the detection performance of complex malware under the condition of limited samples.
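The limited-data draw defined above can be sketched in Python. This is a minimal illustration under stated assumptions, not code from the paper: the function name `draw_limited_subset`, the parameter `max_percent` (standing in for \(\nabla\)), and the toy `(sample_id, label)` pairs are all introduced here for the example.

```python
import random


def draw_limited_subset(dataset, max_percent, seed=0):
    """Randomly draw at most max_percent% of the samples in dataset.

    `dataset` plays the role of D_n; the returned list plays the role of D_x^l.
    """
    if not 0 <= max_percent <= 100:
        raise ValueError("max_percent must be between 0 and 100")
    n = len(dataset)
    x = int(n * max_percent / 100)  # truncation keeps x <= max_percent% of n
    rng = random.Random(seed)
    return rng.sample(dataset, x)  # sampling without replacement


# Toy data set D = B + M + O_m + S_m, as hypothetical (sample_id, label) pairs.
labels = ["B", "M", "O_m", "S_m"]
full = [(i, labels[i % 4]) for i in range(200)]

limited = draw_limited_subset(full, max_percent=10)
print(len(limited))  # 10% of 200 = 20
```

Because the draw is uniform over the whole data set, the class mix of the limited subset is not guaranteed to match that of the full set; a stratified draw would be one alternative if every class must be represented.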
