Towards an in-depth detection of malware using distributed QCNN

Tony Quertier, Grégoire Barrué
2023-12-19
Abstract: Malware detection is an important topic in current cybersecurity, and machine learning appears to be one of the main solutions under consideration, even though problems generalizing to new malware remain. With the aim of exploring the potential of quantum machine learning in this domain, our previous work showed that quantum neural networks do not perform well on image-based malware detection when using only a few qubits. To enhance the performance of our quantum algorithms for image-based malware detection without increasing the number of qubits needed, we implement a new preprocessing of our dataset using the Grayscale method and couple it with a model composed of five distributed quantum convolutional networks and a scoring function. We obtain an increase of around 20% in our results, for both test accuracy and F1-score.
Cryptography and Security, Artificial Intelligence, Quantum Physics
What problem does this paper attempt to address?
This paper asks: **how can distributed quantum convolutional neural networks (QCNNs) improve image-based malware detection under tight resource constraints, in particular with only a few qubits?**

The work is motivated by three difficulties:

1. **Insufficient generalization**: existing machine-learning models struggle to classify newly emerging malware.
2. **Difficulty of image-based detection**: malware images are complex and not very representative, so convolutional neural networks need large amounts of data to extract useful information.
3. **Limited quantum resources**: the number of available qubits is small, and quantum neural networks perform poorly on complex tasks at that scale.

To address these difficulties, the authors propose a three-step method:

- **Preprocess the dataset**: use the grayscale method to convert malware into images, turning different parts of each PE file (such as `.text`, `.data`, `.rdata`, `.rsrc`, `.reloc`) into separate 8x8 sub-images.
- **Train distributed QCNNs**: train one 8-qubit QCNN per sub-image, five QCNNs in total.
- **Apply a scoring function**: use a model such as XGBoost to combine the outputs of the individual QCNNs into the final malware detection result.

With this method, the authors report an improvement of around 20% in both test accuracy and F1-score over their previous work, without increasing the number of qubits required.
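The preprocessing step described above can be illustrated with a minimal sketch. It assumes the raw bytes of each PE section have already been extracted by some PE parser and are passed in as a dictionary. The square layout, the block-averaging resize, the zero image for missing sections, and all function names (`bytes_to_square_image`, `resize_image`, `section_images`, `score_features`) are illustrative assumptions, not the authors' implementation.

```python
import math

# Section names taken from the summary above; the real pipeline may use others.
SECTIONS = (".text", ".data", ".rdata", ".rsrc", ".reloc")


def bytes_to_square_image(data: bytes) -> list[list[int]]:
    """Lay raw bytes out as a square grayscale image (one byte = one pixel)."""
    # Smallest side whose square holds all bytes, i.e. ceil(sqrt(len(data))).
    side = math.isqrt(len(data) - 1) + 1 if data else 1
    padded = data.ljust(side * side, b"\x00")  # zero-pad the last row
    return [list(padded[r * side:(r + 1) * side]) for r in range(side)]


def resize_image(image: list[list[int]], size: int = 8) -> list[list[float]]:
    """Resize a square image to size x size by block averaging (assumed method).

    Pixel values are rescaled from 0..255 to the range [0, 1].
    """
    side = len(image)
    out = []
    for i in range(size):
        r0 = i * side // size
        r1 = max((i + 1) * side // size, r0 + 1)  # at least one source row
        row = []
        for j in range(size):
            c0 = j * side // size
            c1 = max((j + 1) * side // size, c0 + 1)  # at least one source column
            block = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(block) / (255 * len(block)))
        out.append(row)
    return out


def section_images(sections: dict[str, bytes], size: int = 8) -> dict[str, list[list[float]]]:
    """Build one size x size image per section; a missing section becomes all zeros."""
    return {
        name: resize_image(bytes_to_square_image(sections.get(name, b"")), size)
        for name in SECTIONS
    }


def score_features(qcnn_outputs: dict[str, float]) -> list[float]:
    """Order the per-section QCNN outputs into one feature vector for the scorer."""
    return [qcnn_outputs[name] for name in SECTIONS]
```

In this sketch, each of the five images returned by `section_images` would be fed to its own QCNN, and the vector returned by `score_features` would be the input to the scoring model (XGBoost in the summary). Keeping the section order fixed in `SECTIONS` ensures the scorer always sees each QCNN's output in the same feature position.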