Explainability Guided Adversarial Evasion Attacks on Malware Detectors

Kshitiz Aryal, Maanak Gupta, Mahmoud Abdelsalam, Moustafa Saleh
2024-05-03
Abstract: As the focus on security of Artificial Intelligence (AI) is becoming paramount, research on crafting and inserting optimal adversarial perturbations has become increasingly critical. In the malware domain, this adversarial sample generation relies heavily on the accuracy and placement of crafted perturbation with the goal of evading a trained classifier. This work focuses on applying explainability techniques to enhance the adversarial evasion attack on a machine-learning-based Windows PE malware detector. The explainable tool identifies the regions of PE malware files that have the most significant impact on the decision-making process of a given malware detector, and therefore, the same regions can be leveraged to inject the adversarial perturbation for maximum efficiency. Profiling all the PE malware file regions based on their impact on the malware detector's decision enables the derivation of an efficient strategy for identifying the optimal location for perturbation injection. The strategy should incorporate the region's significance in influencing the malware detector's decision and the sensitivity of the PE malware file's integrity towards modifying that region. To assess the utility of explainable AI in crafting an adversarial sample of Windows PE malware, we utilize the DeepExplainer module of SHAP for determining the contribution of each region of PE malware to its detection by a CNN-based malware detector, MalConv. Furthermore, we analyzed the significance of SHAP values at a more granular level by subdividing each section of Windows PE into small subsections. We then performed an adversarial evasion attack on the subsections based on the corresponding SHAP values of the byte sequences.
Cryptography and Security
### What problem does this paper attempt to solve?

This paper addresses how explainability techniques can improve the efficiency and success rate of adversarial evasion attacks on Windows PE malware detectors. Specifically, the researchers use explainability techniques to identify the regions of a PE file that most influence the detection result, then inject adversarial perturbations into those regions to evade detection efficiently while preserving the file's functionality.

#### Background and Problem Description

1. **Challenges of Adversarial Evasion Attacks**:
   - The goal of an adversarial evasion attack is to modify a malware file so that a machine-learning model misclassifies it as benign.
   - Binary executables are subject to strict semantic constraints, so random perturbations may destroy a file's executability and functionality. The location and method of perturbation therefore need to be designed carefully.
2. **Limitations of Existing Methods**:
   - Early research mainly conducted adversarial attacks in the feature space. Such attacks cannot be mapped directly onto the actual file structure (the problem space), which limits their practicality.
   - Recent research has injected perturbations into different regions of PE files (such as headers and free space), but it lacks a comprehensive analysis of the entire PE file structure, in particular of how the effect of a perturbation differs from region to region.
3. **Application of Explainability Techniques**:
   - Explainability techniques (such as SHAP values) help explain a machine-learning model's decision-making process and thereby reveal the file regions that have the greatest impact on the detection result.
   - Using explainability techniques to guide where adversarial perturbations are injected can improve the success rate and efficiency of evasion attacks while maintaining the functional integrity of the malware.
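
The idea of profiling file regions by their influence on the detector can be illustrated with a small sketch. It assumes that per-byte attribution scores have already been computed and are available as a flat list; the function name `aggregate_by_region`, the region layout, the toy scores, and the choice of a plain sum as the aggregate are all hypothetical and not taken from the paper.

```python
from typing import Dict, List, Tuple


def aggregate_by_region(
    attributions: List[float],
    regions: Dict[str, Tuple[int, int]],
) -> Dict[str, float]:
    """Sum per-byte attribution scores over each named [start, end) byte range."""
    totals: Dict[str, float] = {}
    for name, (start, end) in regions.items():
        totals[name] = sum(attributions[start:end])
    return totals


# Toy per-byte scores (hypothetical values, not real SHAP output).
scores = [0.5, 0.25, -0.25, 0.0, 1.0, 1.5, -0.5, 0.25]

# Hypothetical region layout: name -> (start, end) byte offsets.
layout = {"header": (0, 2), "code": (2, 6), "data": (6, 8)}

totals = aggregate_by_region(scores, layout)

# Regions ordered from most to least aggregate contribution.
ranked = sorted(totals, key=totals.get, reverse=True)
print(totals)
print(ranked)
```

In this toy layout the `code` range has the largest summed score, so it would be ranked first. Whether a sum, a mean, or an absolute-value aggregate is most appropriate is a design choice this sketch does not settle.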
#### Research Objectives

- **Improve the Efficiency of Adversarial Evasion Attacks**: Use explainability techniques (such as SHAP values) to identify the regions of PE files that have the greatest impact on the detection result, and optimize the perturbation injection strategy accordingly.
- **Functionality-Preserving Evasion Attacks**: Ensure that the injected perturbations do not destroy the executability and functionality of the malware.
- **Fine-grained Perturbation Injection**: Divide each section of the PE file into smaller subsections and select the perturbation injection location based on their SHAP values, giving finer control over placement.

#### Method Overview

1. **Calculate SHAP Values**: Use the DeepExplainer module of SHAP to calculate a SHAP value for each byte and evaluate its impact on the detection result.
2. **Area Selection**: Based on the aggregated SHAP values, select the PE regions that have the greatest impact on the detection result as targets for perturbation injection.
3. **Perturbation Generation**: In the selected regions, generate adversarial perturbations through methods such as gradient descent, so that the perturbations are effective and preserve functionality.
4. **Experimental Verification**: Evaluate the SHAP-value-based perturbation injection strategy experimentally against random perturbations and verify its advantage in evasion attacks.

### Summary

The main contribution of this paper is an adversarial evasion attack strategy guided by explainability techniques (such as SHAP values), which can significantly improve the success rate and efficiency of evasion attacks without destroying the functionality of the malware. This provides new ideas and technical means for research on adversarial attacks.
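
The fine-grained subdivision described under Research Objectives can be sketched in the same spirit. This is a minimal illustration, assuming per-byte attribution scores are already available; the function name `rank_subsections`, the fixed `chunk_size`, the toy scores, and ranking by summed score are hypothetical choices, not details confirmed by the paper.

```python
from typing import List, Tuple


def rank_subsections(
    attributions: List[float],
    start: int,
    end: int,
    chunk_size: int,
) -> List[Tuple[int, int, float]]:
    """Split [start, end) into chunks of at most chunk_size bytes.

    Returns (chunk_start, chunk_end, summed_score) tuples,
    sorted from highest to lowest summed score.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    for lo in range(start, end, chunk_size):
        hi = min(lo + chunk_size, end)  # the last chunk may be shorter
        chunks.append((lo, hi, sum(attributions[lo:hi])))
    return sorted(chunks, key=lambda c: c[2], reverse=True)


# Toy per-byte scores (hypothetical values, not real SHAP output).
scores = [0.5, 0.25, -0.25, 0.0, 1.0, 1.5, -0.5, 0.25, 0.5, 0.0]

# Subdivide the hypothetical section spanning bytes 2..10 into 3-byte chunks.
ranked_chunks = rank_subsections(scores, start=2, end=10, chunk_size=3)
for lo, hi, total in ranked_chunks:
    print(f"bytes [{lo}, {hi}): {total}")
```

The sketch covers only the ranking of subsections by attribution; it does not model the file-integrity constraints that the abstract says a placement strategy should also take into account.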