A Lightweight Model for Malicious Code Classification Based on Structural Reparameterisation and Large Convolutional Kernels

Sicong Li, Jian Wang, Yafei Song, Shuo Wang, Yanan Wang
DOI: https://doi.org/10.1007/s44196-023-00400-9
IF: 2.259
2024-02-15
International Journal of Computational Intelligence Systems
Abstract: With the advancement of adversarial techniques for malicious code, malevolent attackers have propagated numerous malicious code variants through shell coding and code obfuscation. Addressing the current issues of insufficient accuracy and efficiency in malicious code classification methods based on deep learning, this paper introduces a detection strategy for malicious code that unites Convolutional Neural Networks (CNNs) and Transformers. The approach uses a deep neural architecture with a novel fusion module that reparameterizes the structure, which reduces memory access costs by eliminating residual connections within the network. In addition, linear training-time overparameterization and large-kernel convolution techniques are employed to improve network accuracy. In the data preprocessing stage, a pixel-based image size normalization algorithm and data augmentation techniques are used to remedy the loss of texture information during malicious code image scaling and the class imbalance in the dataset, thereby enhancing the expression of essential features and alleviating model overfitting. Empirical evidence substantiates that this method achieves higher accuracy than the most recent malicious code detection technologies.
computer science, artificial intelligence, interdisciplinary applications
What problem does this paper attempt to address?
The problem that this paper attempts to solve is the lack of accuracy and efficiency in current deep-learning-based malicious code classification methods. Specifically, with the development of adversarial techniques, malicious attackers have spread a large number of malicious code variants through shell coding and code obfuscation, which makes it difficult for traditional malicious code detection methods to identify these variants efficiently and accurately. Therefore, this paper proposes a malicious code detection strategy that combines Convolutional Neural Networks (CNNs) and Transformers, aiming to improve the accuracy and efficiency of detection.

### Main contributions of the paper:

1. **Data pre-processing stage**:
   - **Image size normalization algorithm**: An image size normalization algorithm based on pixel padding is adopted to reduce the loss of texture information during the malicious code image scaling process.
   - **Data augmentation technique**: Data augmentation techniques are used to solve the problem of unbalanced data set categories, enhance the expression of key features, and alleviate the over-fitting phenomenon of the model.
2. **Model structure optimization**:
   - **Structural re-parameterization**: A fusion module is introduced to re-parameterize the model structure, eliminating skip connections in the network, thereby reducing memory access costs and increasing inference speed.
   - **Large convolution kernel technique**: Large convolution kernels are used to replace the early self-attention mechanism to improve the performance of the model while reducing the impact of overall latency.
   - **Linear training-time over-parameterization**: By increasing the number of parameters during the training process, the capacity of the model is increased, thereby further improving the performance of the model.
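The idea of turning an executable into a grayscale image and normalizing its size by padding rather than by interpolation can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names `bytes_to_rows` and `pad_to_square`, the fixed row `width`, the zero fill value, and the cropping of oversized inputs are all assumptions made for illustration.

```python
def bytes_to_rows(data, width):
    """Lay raw bytes out row by row; each byte becomes one grayscale pixel (0-255)."""
    rows = [list(data[i:i + width]) for i in range(0, len(data), width)]
    # Assumption: a short final row is completed with zero-valued pixels.
    if rows and len(rows[-1]) < width:
        rows[-1] += [0] * (width - len(rows[-1]))
    return rows


def pad_to_square(rows, size, fill=0):
    """Bring an image to size x size by padding, so existing pixels are never resampled.

    Assumption: rows or columns beyond `size` are simply cropped.
    """
    out = []
    for row in rows[:size]:
        kept = row[:size]
        out.append(kept + [fill] * (size - len(kept)))
    # Append blank rows until the image is square.
    out += [[fill] * size for _ in range(size - len(out))]
    return out


# A 10-byte stand-in for an executable, laid out 4 bytes per row, then padded to 6 x 6.
sample = bytes(range(10))
image = pad_to_square(bytes_to_rows(sample, width=4), size=6)
```

Because no pixel value is interpolated, the byte-level texture of the original file is preserved exactly inside the padded region, which is the property the padding-based normalization is meant to protect.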
### Specific technical details:

- **Data pre-processing**:
  - **Malicious code visualization**: Malicious code executable files are converted into grayscale images without the need for feature engineering or domain expert knowledge.
  - **Image size normalization**: The bilinear interpolation algorithm is used to normalize the images and maintain the original texture features.
  - **Data augmentation**: New data samples are generated through image transformation to solve the problem of unbalanced data set categories.
- **Feature extraction and classification**:
  - **MDC-RepNet architecture**: Combining the advantages of CNN and Transformer, a multi-stage network structure is designed, and RepMixer is used for feature mixing in each stage.
  - **Convolutional Feed-Forward Network (ConvFFN)**: In each stage, depthwise separable convolution and a feed-forward network are used to achieve more efficient feature extraction and model representation.
  - **Structural re-parameterization**: By converting the multi-branch structure into a single-branch structure, memory consumption and the number of parameters are reduced, and the running speed is increased.
  - **Linear training-time over-parameterization**: During the training process, the number of parameters is increased to improve the capacity and performance of the model.
  - **Large convolution kernel**: Large convolution kernels are used to replace the self-attention mechanism to improve the performance of the model while reducing latency.

Through these techniques, the MDC-RepNet proposed in this paper is superior to the latest malicious code detection techniques in both accuracy and operational efficiency.
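The conversion of a multi-branch structure into a single branch relies on the linearity of convolution. The following is a minimal sketch of that principle on a 1-D, single-channel signal, not the MDC-RepNet fusion module itself: it omits normalization layers and multiple channels, and the names `conv1d_same`, `train_time_block`, and `reparameterize` are hypothetical.

```python
def conv1d_same(x, kernel):
    """Zero-padded 1-D cross-correlation with an odd-length kernel (output length == input length)."""
    r = len(kernel) // 2
    padded = [0.0] * r + list(x) + [0.0] * r
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def train_time_block(x, kernel):
    """Two-branch form: a convolution branch plus an identity skip connection."""
    return [xi + ci for xi, ci in zip(x, conv1d_same(x, kernel))]


def reparameterize(kernel):
    """Fold the identity branch into the kernel by adding 1 at the centre tap."""
    fused = list(kernel)
    fused[len(fused) // 2] += 1.0
    return fused


x = [1.0, 2.0, -1.0, 0.5]
kernel = [0.25, 0.5, -0.25]

fused_kernel = reparameterize(kernel)
two_branch = train_time_block(x, kernel)        # conv(x) + x, two memory reads of x
single_branch = conv1d_same(x, fused_kernel)    # one convolution, no skip connection
```

Both forms compute the same output, but the fused form reads the input once and needs no separate addition, which is where the reduction in memory access cost comes from. Parallel convolution branches added for training-time over-parameterization can be merged in the same way by summing their kernels.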