Backdoor Attack and Defense on Deep Learning: A Survey

Yang Bai, Gaojie Xing, Hongyan Wu, Zhihong Rao, Chuan Ma, Shiping Wang, Xiaolei Liu, Yimin Zhou, Jiajia Tang, Kaijun Huang, Jiale Kang
DOI: https://doi.org/10.1109/tcss.2024.3482723
2024-01-01
IEEE Transactions on Computational Social Systems
Abstract: Deep learning, an important branch of machine learning, has been widely applied in computer vision, natural language processing, speech recognition, and other fields. However, recent studies have revealed that deep learning systems are vulnerable to backdoor attacks. A backdoor attacker injects a hidden backdoor into a deep learning model so that the infected model's predictions are maliciously changed when an input carrying the backdoor trigger activates the backdoor, while the model behaves normally on benign samples. This kind of attack can have severe consequences in the real world. Therefore, research on defending against backdoor attacks has emerged rapidly. In this article, we provide a comprehensive survey of backdoor attacks, detections, and defenses previously demonstrated on deep learning. We investigate widely used model architectures, benchmark datasets, and metrics in backdoor research, and we classify attacks, detections, and defenses according to different criteria. Furthermore, we analyze some limitations of existing methods and, on that basis, point out several promising directions for future research. Through this survey, beginners can gain a preliminary understanding of backdoor attacks and defenses. We also anticipate that this work will provide new perspectives and inspire further research into backdoor attack and defense methods in deep learning.