A stealthy and robust backdoor attack via frequency domain transform
DOI: https://doi.org/10.1007/s11280-023-01153-3
2023-05-11
World Wide Web
Abstract: Deep learning models are vulnerable to backdoor attacks, in which an adversary injects a hidden backdoor into a model so that the victim model performs well on clean data but outputs predefined wrong results on data containing specific triggers (e.g., a pattern or a specific accessory). Although existing attack methods are effective, they are generally neither stealthy nor robust: the backdoor triggers look unnatural and are easily detected, and they rarely survive data augmentation operations. To address these issues, we explore new attack methods that significantly improve the stealthiness and robustness of backdoor attacks. Specifically, inspired by digital watermarking techniques, we propose two backdoor trigger injection algorithms, one based on the discrete Fourier transform and one based on the discrete cosine transform. These algorithms inject the trigger in the frequency domain rather than the spatial domain, which ensures stealthiness. In addition, they divide the original data into multiple blocks and inject the trigger into each block, which improves robustness. We experimentally evaluated the proposed methods on the GTSRB and CIFAR10 datasets, and the results demonstrate that our methods markedly improve the stealthiness and robustness of backdoor attacks without compromising effectiveness. For example, on GTSRB, compared with BadNets and Blend, our methods generate more natural poisoned data and yield improvements of at least 80.99%, 68.09%, 25.49%, and 63.31% under random horizontal flip, random vertical flip, random cropping (padding=2), and random cropping (padding=4), respectively.
computer science, information systems, software engineering
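
The block-wise frequency-domain injection described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the block size, which coefficients are modified, or the perturbation strength, so the names `dct_matrix` and `inject_trigger`, the 8x8 block, the coefficient position `(3, 3)`, and the strength `20.0` are all assumptions chosen for illustration. The sketch works on a single-channel image and uses an orthonormal DCT-II built directly with NumPy.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix C, so that C @ x is the DCT of x."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale factor
    return c

def inject_trigger(image, block=8, coeff=(3, 3), strength=20.0):
    """Add `strength` to one DCT coefficient in every block of a 2-D image.

    All parameter defaults are hypothetical; the source does not state them.
    The result is returned as float64 and is neither clipped nor rounded.
    """
    h, w = image.shape
    assert h % block == 0 and w % block == 0, "image must tile evenly"
    c = dct_matrix(block)
    out = image.astype(np.float64).copy()
    for r in range(0, h, block):
        for s in range(0, w, block):
            patch = out[r:r + block, s:s + block]
            freq = c @ patch @ c.T        # forward 2-D DCT of the block
            freq[coeff] += strength       # perturb a single coefficient
            out[r:r + block, s:s + block] = c.T @ freq @ c  # inverse DCT
    return out

# Usage: poison a random 32x32 grayscale image.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32)).astype(np.float64)
poisoned = inject_trigger(img)
```

Because the transform is orthonormal, the perturbation energy in each block equals `strength`, spread over all pixels of that block, which is one reason a frequency-domain trigger can be harder to see than a spatial patch. Repeating the same perturbation in every block is meant to mirror the abstract's idea of multiple injections, so that some copies may survive operations such as cropping. A practical pipeline would also need to clip and quantize the result back to valid pixel values, which this sketch omits.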