On Model Outsourcing Adaptive Attacks to Deep Learning Backdoor Defenses
Huaibing Peng, Huming Qiu, Hua Ma, Shuo Wang, Anmin Fu, Said F. Al-Sarawi, Derek Abbott, Yansong Gao
DOI: https://doi.org/10.1109/tifs.2024.3349869
IF: 7.231
2024-02-02
IEEE Transactions on Information Forensics and Security
Abstract: Deep learning models with backdoors behave maliciously when triggered but appear normal otherwise. This risk, often heightened by model outsourcing, challenges their secure use. Although countermeasures exist, their robustness against adaptive attacks is under-examined, which may lead to security misjudgments. This study is the first detailed examination illustrating the difficulty of detecting backdoors in outsourced models, especially when attackers adapt their strategies, even if their capabilities are significantly limited. It is relatively straightforward for attackers to circumvent a detection method by trivially violating its threat model (e.g., using advanced backdoor types or trigger designs not covered by the detection). However, this research shows that several leading detection defenses can be evaded simultaneously using simple adaptive strategies, even under their defined threat models and with limited adversary capabilities (e.g., using easily detectable triggers while maintaining a high attack success rate). Specifically, this study introduces a novel methodology that combines trigger specificity enhancement and training regulation in a symbiotic manner. This approach allows us to evade multiple backdoor detection defenses simultaneously, including Neural Cleanse (Oakland '19), ABS (CCS '19), and MNTD (Oakland '21), the detection tools selected for the Evasive Trojans Track of the 2022 NeurIPS Trojan Detection Challenge. Even when evaluated against these defenses in conjunction and under stringent conditions, namely a high attack success rate (> 97%) and restriction to the simplest trigger (a small white square), our straightforward method won the second prize in the NeurIPS Trojan Detection Challenge. Notably, for the first time, our adaptive attack also successfully evaded other recent state-of-the-art defenses, including FeatureRE (NeurIPS '22) and Beatrix (NDSS '23).
This study suggests that existing model outsourcing backdoor defenses remain vulnerable to adaptive attacks, and thus, the use of third-party models should be avoided whenever possible.
computer science, theory & methods; engineering, electrical & electronic