TIA: Token Importance Transferable Attack on Vision Transformers.
Tingchao Fu, Fanxiao Li, Jinhong Zhang, Liang Zhu, Yuanyu Wang, Wei Zhou
DOI: https://doi.org/10.1007/978-981-97-0945-8_6
2024-01-01
Abstract: Vision transformers (ViTs) have made significant progress in the past few years. Recent research has revealed that ViTs are vulnerable to transfer-based attacks, in which an attacker uses a local surrogate model to generate adversarial examples and then transfers these malicious examples to attack the target black-box ViT directly. This threat makes it challenging to deploy ViTs in security-critical tasks, so there is a pressing need to explore the robustness of ViTs against transfer-based attacks. However, existing transfer-based attack methods do not fully consider the unique structure of ViTs. They attack the intermediate output tokens of ViTs indiscriminately, so the perturbations concentrate on model-specific information within the tokens, which limits the transferability of the generated adversarial examples. To address these limitations, we propose Token Importance Attack (TIA), a novel ViT-oriented transfer-based attack method. Specifically, we introduce a Randomly Shuffle Patches (RSP) strategy to expand the diversity of the input space. By applying RSP, we can generate multiple shuffled images from a single image, which yields multiple token gradients. TIA then ensembles the token gradients of these shuffled images into a guide map that focuses the perturbation on the model-independent information in the tokens rather than on model-specific information. Benefiting from these two components, TIA avoids overfitting to the surrogate model and thus enhances the transferability of the crafted adversarial examples. Extensive experiments on common datasets with different ViTs and CNNs demonstrate the effectiveness of TIA.
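
The two components described in the abstract, RSP and the ensembling of token gradients into a guide map, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not specify the patch size, the number of shuffled copies, or how the gradients are combined, so the sketch assumes non-overlapping square patches and a plain mean. The names `random_shuffle_patches`, `ensemble_guide_map`, and `token_grad_fn` are hypothetical, and `token_grad_fn` stands in for whatever surrogate-model routine returns token gradients for an image.

```python
import numpy as np


def random_shuffle_patches(image, patch_size, rng):
    """Return a copy of `image` (H, W, C) with its non-overlapping patches
    randomly permuted. Assumption: H and W are multiples of `patch_size`."""
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be multiples of patch_size")
    gh, gw = h // patch_size, w // patch_size

    # Split into a flat list of patches: (gh * gw, patch, patch, C).
    patches = (
        image.reshape(gh, patch_size, gw, patch_size, c)
        .transpose(0, 2, 1, 3, 4)
        .reshape(gh * gw, patch_size, patch_size, c)
    )

    # Permute the patch order, then reassemble the image grid.
    shuffled = patches[rng.permutation(gh * gw)]
    return (
        shuffled.reshape(gh, gw, patch_size, patch_size, c)
        .transpose(0, 2, 1, 3, 4)
        .reshape(h, w, c)
    )


def ensemble_guide_map(image, token_grad_fn, num_shuffles, patch_size, rng):
    """Average the outputs of `token_grad_fn` over several shuffled copies.

    Assumption: a plain mean is used as the ensemble; `token_grad_fn` is a
    hypothetical callable supplied by the caller."""
    grads = [
        token_grad_fn(random_shuffle_patches(image, patch_size, rng))
        for _ in range(num_shuffles)
    ]
    return np.mean(grads, axis=0)
```

With a real surrogate ViT, `token_grad_fn` would return the gradient of the attack loss with respect to the intermediate tokens. In this sketch any callable that returns a fixed-shape array works, which keeps the example independent of a particular deep learning framework.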