Abstract: Traditional malware detection systems that rely on signature-based methods cannot detect new, previously unseen malware. Moreover, conventional machine learning methods for malware detection use features extracted through static or dynamic program analysis, which requires code debugging and execution, primarily through offline processing, and therefore does not scale well. This paper proposes a novel intelligent malware classification method using a deep convolutional neural network (IMCNN) in organizational networks enabled with honeypots. Systematic customization of pre-trained convolutional neural networks (CNNs) through transfer learning, combined with ensemble learning for classification, is presented to detect modern intelligent malware. Real-world malware samples are systematically labeled and visualized as grayscale images. Four cutting-edge deep CNN models (VGG16, VGG19, InceptionV3, and ResNet50), pre-trained on the ImageNet database (≥ 1 million images), are fine-tuned as feature extractors along with a basic CNN model. Three strategies are designed for feature extraction and selection: a rectified linear unit (ReLU) fully connected layer embedded in a deep CNN model, principal component analysis (PCA), and singular value decomposition (SVD). The reduced feature sets are stacked and used to train k-nearest neighbor (k-NN), support vector machine (SVM), and random forest (RF) classifiers. The predicted class probabilities of these classifiers are then combined by soft voting for the final classification. The proposed method is evaluated on the MalImg dataset (9339 malware samples from 25 families) and on a real-world modern malware dataset (690 malware samples from 22 families). The experimental results show that, despite using a reduced feature set, the IMCNN detects malware with 99.36% test accuracy on unseen data for the MalImg dataset and 92.11% for the real-world malware. In addition, the proposed method is compared with several existing state-of-the-art malware detection models and achieves the highest accuracy among them. The experiments also demonstrate that the proposed method is resilient to the polymorphic code obfuscation used by malware authors.
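Two steps of the pipeline described in the abstract, visualizing a binary as a grayscale image and combining classifier outputs by soft voting, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `bytes_to_grayscale` and `soft_vote`, the row width of 256, the zero padding of the last row, and the equal classifier weights are all assumptions made for illustration.

```python
from typing import List, Sequence


def bytes_to_grayscale(data: bytes, width: int = 256) -> List[List[int]]:
    """Lay out raw bytes as rows of 8-bit grayscale pixels (0-255).

    The row width and the zero padding of the final row are assumptions
    of this sketch, not values taken from the paper.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])  # each byte is one pixel
        row.extend([0] * (width - len(row)))   # pad the last, shorter row
        rows.append(row)
    return rows


def soft_vote(probabilities: Sequence[Sequence[float]]) -> int:
    """Average per-class probabilities over classifiers (equal weights,
    an assumption) and return the index of the most probable class."""
    if not probabilities:
        raise ValueError("need at least one classifier output")
    n_classes = len(probabilities[0])
    if any(len(p) != n_classes for p in probabilities):
        raise ValueError("all classifiers must score the same classes")
    means = [
        sum(p[c] for p in probabilities) / len(probabilities)
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=means.__getitem__)


# Hypothetical inputs: five bytes of a binary, and the class probabilities
# that three classifiers (e.g. k-NN, SVM, RF) assign to one sample.
image = bytes_to_grayscale(b"\x00\x7f\xff\x10\x20", width=4)
predicted = soft_vote([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.4, 0.3],
])
```

In this example the first classifier alone would choose class 0, but the averaged probabilities favor class 1. Soft voting uses each model's confidence rather than only its top prediction.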

IMCNN: Intelligent Malware Classification using Deep Convolution Neural Networks as Transfer Learning and Ensemble Learning in Honeypot Enabled Organizational Network
