Malware Classification using a Hybrid Hidden Markov Model-Convolutional Neural Network

Ritik Mehta,Olha Jureckova,Mark Stamp
2024-12-25
Abstract:The proliferation of malware variants poses a significant challenges to traditional malware detection approaches, such as signature-based methods, necessitating the development of advanced machine learning techniques. In this research, we present a novel approach based on a hybrid architecture combining features extracted using a Hidden Markov Model (HMM), with a Convolutional Neural Network (CNN) then used for malware classification. Inspired by the strong results in previous work using an HMM-Random Forest model, we propose integrating HMMs, which serve to capture sequential patterns in opcode sequences, with CNNs, which are adept at extracting hierarchical features. We demonstrate the effectiveness of our approach on the popular Malicia dataset, and we obtain superior performance, as compared to other machine learning methods -- our results surpass the aforementioned HMM-Random Forest model. Our findings underscore the potential of hybrid HMM-CNN architectures in bolstering malware classification capabilities, offering several promising avenues for further research in the field of cybersecurity.
Machine Learning
What problem does this paper attempt to address?