Deep learning vs. adversarial noise: a battle in malware image analysis
K. A. Asmitha, Vinod Puthuvath, K. A. Rafidha Rehiman, S. L. Ananth
DOI: https://doi.org/10.1007/s10586-024-04397-4
2024-04-19
Cluster Computing
Abstract: The number of malware variants has increased steeply, driven by their growing sophistication and their use of the latest technologies. This poses a severe threat to smart devices and IT infrastructure. Malware visualization has emerged as an especially attractive technique, primarily because it obviates the need for disassembly or code execution. In this approach, malicious executables are transformed into image-like visual representations, from which textural features are extracted using the Local Binary Pattern (LBP) technique. Classification models are then built using ResNet50, VGG16, and customized models tailored to the task. These models are evaluated extensively on two benchmark datasets: the MalImg dataset (9,342 malware instances across 25 families) and the Malware Classification Challenge dataset (BIG2015) (10,868 labeled malware instances across nine families). Additionally, the models are validated on a self-made dataset, which we named Malhub, consisting of 26,452 executables from 20 families. Furthermore, we implemented a white-box adversarial attack using additive noise (Gaussian, Local Variable, Poisson, Salt and Pepper, Speckle). We observed F1 scores in the range of 0.992 to 0.993 for MalImg, 0.874 to 0.878 for BIG2015, and 0.014 to 0.992 for the Malhub dataset. These results show that further effort is required to tune machine learning models to detect adversarial examples.
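The pipeline described in the abstract (bytes to image, LBP texture features, additive noise) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not state the image width, the LBP variant, or the noise parameters, so `width`, the basic 8-neighbour LBP, `sigma`, and `amount` below are all assumptions chosen for illustration, and the function names are hypothetical. Only two of the five noise types are shown.

```python
import random
from typing import List, Optional

Image = List[List[int]]  # rows of grayscale pixel values in 0..255


def bytes_to_image(data: bytes, width: int = 16) -> Image:
    """Reshape raw bytes into rows of `width` pixels; zero-pad the last row.

    The width is an assumed parameter, not a value from the paper.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row.extend([0] * (width - len(row)))
        rows.append(row)
    return rows


def lbp_codes(image: Image) -> Image:
    """Basic 8-neighbour LBP code for every interior pixel (assumed variant)."""
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    height = len(image)
    width = len(image[0]) if height else 0
    codes = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            center = image[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                if image[y + dy][x + dx] >= center:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


def lbp_histogram(codes: Image) -> List[float]:
    """Normalized 256-bin histogram of LBP codes, usable as a texture feature."""
    hist = [0] * 256
    for row in codes:
        for code in row:
            hist[code] += 1
    total = sum(hist)
    return [count / total for count in hist] if total else [0.0] * 256


def add_gaussian_noise(image: Image, sigma: float = 10.0,
                       rng: Optional[random.Random] = None) -> Image:
    """Add zero-mean Gaussian noise to each pixel and clip to 0..255."""
    rng = rng or random.Random()
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in image]


def add_salt_and_pepper(image: Image, amount: float = 0.05,
                        rng: Optional[random.Random] = None) -> Image:
    """Set roughly `amount` of the pixels to 0 (pepper) or 255 (salt)."""
    rng = rng or random.Random()
    noisy = []
    for row in image:
        new_row = []
        for p in row:
            r = rng.random()
            if r < amount / 2:
                new_row.append(0)
            elif r < amount:
                new_row.append(255)
            else:
                new_row.append(p)
        noisy.append(new_row)
    return noisy
```

Under these assumptions, comparing `lbp_histogram(lbp_codes(img))` for a clean image with the same feature computed on `add_salt_and_pepper(img)` gives a rough sense of how much additive noise shifts the texture descriptor that a classifier would see.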
computer science, information systems, theory & methods