Black-box Adversarial Transferability: An Empirical Study in Cybersecurity Perspective

Khushnaseeb Roshan, Aasim Zafar
DOI: https://doi.org/10.1016/j.cose.2024.103853
2024-04-15
Abstract: The rapid advancement of artificial intelligence within the realm of cybersecurity raises significant security concerns. The vulnerability of deep learning models in adversarial attacks is one of the major issues. In adversarial machine learning, malicious users try to fool the deep learning model by inserting adversarial perturbation inputs into the model during its training or testing phase. Subsequently, it reduces the model confidence score and results in incorrect classifications. The novel key contribution of the research is to empirically test the black-box adversarial transferability phenomena in cyber attack detection systems. It indicates that the adversarial perturbation input generated through the surrogate model has a similar impact on the target model in producing the incorrect classification. To empirically validate this phenomenon, surrogate and target models are used. The adversarial perturbation inputs are generated based on the surrogate-model for which the hacker has complete information. Based on these adversarial perturbation inputs, both surrogate and target models are evaluated during the inference phase. We have done extensive experimentation over the CICDDoS-2019 dataset, and the results are classified in terms of various performance metrics like accuracy, precision, recall, and f1-score. The findings indicate that any deep learning model is highly susceptible to adversarial attacks, even if the attacker does not have access to the internal details of the target model. The results also indicate that white-box adversarial attacks have a severe impact compared to black-box adversarial attacks. There is a need to investigate and explore adversarial defence techniques to increase the robustness of the deep learning models against adversarial attacks.
Cryptography and Security, Machine Learning
What problem does this paper attempt to address?
This paper addresses the vulnerability of deep-learning models to adversarial attacks in cybersecurity. Specifically, the authors study black-box adversarial transferability: whether adversarial perturbation inputs generated from a surrogate model have a similar effect on a separate target model, causing it to misclassify. The main goal is to verify empirically that this phenomenon occurs in cyber attack detection systems and to evaluate how different models perform under adversarial attack. To do so, the authors build two models: a surrogate model, about which the attacker has complete information and which is used to generate the adversarial perturbation samples, and a target model that the attacker attempts to exploit. The two models differ in hyperparameters and internal architecture, but both are trained on the same dataset (CICDDoS-2019). The adversarial samples are generated with the Fast Gradient Sign Method (FGSM), and both models are then evaluated on them during the testing phase, with results reported as accuracy, precision, recall, and F1-score. The results show that deep-learning models are highly vulnerable to adversarial attacks, even when the attacker has no access to the internal details of the target model, and that white-box attacks have a more severe impact than black-box attacks. The paper concludes that adversarial defense techniques need to be investigated and applied to improve the robustness of deep-learning models and to protect the reliability of cyber attack detection systems.
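The surrogate-to-target procedure described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the synthetic data stands in for CICDDoS-2019 features, and the model choices (a logistic-regression surrogate and a small one-hidden-layer network as the target), the helper names, and the value of `eps` are all assumptions made for illustration. FGSM perturbs each input by `eps` in the direction of the sign of the loss gradient with respect to that input, computed here on the surrogate only.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow warnings in exp for extreme logits.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -50, 50)))

def make_data(n, rng):
    # Hypothetical stand-in for benign (0) vs. attack (1) traffic features.
    y = rng.integers(0, 2, size=n).astype(float)
    centers = np.where(y[:, None] == 1, 1.0, -1.0)
    return centers + rng.normal(size=(n, 10)), y

def train_logreg(X, y, lr, epochs):
    # Surrogate: logistic regression trained by full-batch gradient descent.
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def train_mlp(X, y, hidden, lr, epochs, seed):
    # Target: one-hidden-layer tanh network (different architecture).
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    b1, w2, b2 = np.zeros(hidden), rng.normal(scale=0.5, size=hidden), 0.0
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)
        d_out = (sigmoid(h @ w2 + b2) - y) / len(y)   # dLoss/dlogit
        d_h = np.outer(d_out, w2) * (1.0 - h ** 2)    # backprop through tanh
        w2 -= lr * h.T @ d_out
        b2 -= lr * d_out.sum()
        W1 -= lr * X.T @ d_h
        b1 -= lr * d_h.sum(axis=0)
    return W1, b1, w2, b2

def predict_mlp(X, W1, b1, w2, b2):
    return sigmoid(np.tanh(X @ W1 + b1) @ w2 + b2)

def fgsm_logreg(X, y, w, b, eps):
    # For logistic regression, dLoss/dx = (p - y) * w for each sample.
    grad = (sigmoid(X @ w + b) - y)[:, None] * w[None, :]
    return X + eps * np.sign(grad)

def accuracy(prob, y):
    return float(np.mean((prob >= 0.5) == (y == 1)))

rng = np.random.default_rng(0)
X_train, y_train = make_data(2000, rng)
X_test, y_test = make_data(500, rng)

surrogate = train_logreg(X_train, y_train, lr=0.1, epochs=200)
target = train_mlp(X_train, y_train, hidden=16, lr=0.1, epochs=300, seed=2)

eps = 0.5  # assumed perturbation budget (L-infinity)
X_adv = fgsm_logreg(X_test, y_test, *surrogate, eps)  # uses surrogate only

surr_clean = accuracy(sigmoid(X_test @ surrogate[0] + surrogate[1]), y_test)
surr_adv = accuracy(sigmoid(X_adv @ surrogate[0] + surrogate[1]), y_test)
tgt_clean = accuracy(predict_mlp(X_test, *target), y_test)
tgt_adv = accuracy(predict_mlp(X_adv, *target), y_test)

print(f"surrogate (white-box): clean={surr_clean:.3f} adversarial={surr_adv:.3f}")
print(f"target (black-box):    clean={tgt_clean:.3f} adversarial={tgt_adv:.3f}")
```

Comparing the clean and adversarial accuracy of the target model gives a rough measure of how well the perturbations transfer. The target's gradients are never used to craft `X_adv`, which is what makes the second evaluation a black-box setting in the sense used above.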