Abstract: In the past few years, the Transformer has been widely adopted in many domains and applications because of its impressive performance. The Vision Transformer (ViT), a successful and well-known variant, has attracted considerable attention from both industry and academia thanks to its record-breaking performance in various vision tasks. However, like other classical neural networks, ViT is highly nonlinear and can be easily fooled by both natural and adversarial perturbations. This limitation could threaten the deployment of ViT in real industrial environments, especially in safety-critical scenarios. Improving the robustness of ViT is thus an urgent issue. Among the different kinds of robustness, patch robustness is defined as giving a reliable output when a random patch in the input domain is perturbed. The perturbation could be a natural corruption, such as part of the camera lens being blurred. It could also be a distribution shift, such as an object that does not exist in the training data suddenly appearing in front of the camera. In the worst case, it could be a malicious adversarial patch attack, which aims to fool the prediction of a machine learning model by arbitrarily modifying pixels within a restricted region of an input image. This kind of attack is also called a physical attack, as it is considered more realistic than a digital attack. Although there has been some work on improving the patch robustness of Convolutional Neural Networks, related studies on their counterpart, ViT, are still at an early stage, because ViT is usually much more complex and has far more parameters. Its robustness is harder to assess and improve, let alone to guarantee provably. In this work, we propose PatchCensor, which aims to certify the patch robustness of ViT by applying exhaustive testing. We try to provide a provable guarantee by considering the worst-case patch attack scenarios. 
Unlike empirical defenses against adversarial patches, which may be breached by adaptive attacks, certified robust approaches can provide a certified accuracy against arbitrary attacks under certain conditions. However, existing robustness certifications are mostly based on robust training, which often requires substantial training effort and sacrifices model performance on normal samples. To bridge this gap, PatchCensor seeks to improve the robustness of the whole system by detecting abnormal inputs, instead of training a robust model and asking it to give reliable results for every input, which may inevitably compromise accuracy. Specifically, each input is tested by voting over multiple inferences with different mutated attention masks, where at least one inference is guaranteed to exclude the abnormal patch. This can be seen as complete-coverage testing, which could provide a statistical guarantee on inference at test time. Our comprehensive evaluation demonstrates that PatchCensor achieves high certified accuracy (e.g., 67.1% on ImageNet for 2%-pixel adversarial patches), significantly outperforming state-of-the-art techniques while achieving similar clean accuracy (81.8% on ImageNet). The clean accuracy is the same as that of vanilla ViT models. Meanwhile, our technique also supports flexible configurations to handle different adversarial patch sizes by simply changing the masking strategy.
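
The voting idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`make_masks`, `covers_every_patch`, `censor`), the square token grid, and the sliding-window masking layout are all assumptions made for illustration, and `predict` stands in for a ViT forward pass with an attention mask. In this sketch, each mask hides a `window` x `window` block of tokens; if `window >= patch + stride - 1` and the last valid window position is always included, every possible `patch` x `patch` region falls entirely inside at least one hidden block, so at least one inference never sees the patch.

```python
from itertools import product

def window_starts(grid, window, stride):
    """Offsets of hidden windows along one axis; the last valid offset is always included."""
    last = grid - window
    starts = list(range(0, last + 1, stride))
    if starts[-1] != last:
        starts.append(last)
    return starts

def make_masks(grid, window, stride):
    """Return a list of sets of (row, col) token indices to hide from attention (hypothetical layout)."""
    starts = window_starts(grid, window, stride)
    return [
        {(r, c) for r in range(r0, r0 + window) for c in range(c0, c0 + window)}
        for r0, c0 in product(starts, starts)
    ]

def covers_every_patch(grid, patch, masks):
    """Exhaustively check that every patch x patch region is fully hidden by some mask."""
    for r0, c0 in product(range(grid - patch + 1), repeat=2):
        region = {(r, c) for r in range(r0, r0 + patch) for c in range(c0, c0 + patch)}
        if not any(region <= hidden for hidden in masks):
            return False
    return True

def censor(predict, tokens, masks):
    """Run one inference per mask and vote.

    predict(tokens, hidden) -> label is an assumed interface for a masked ViT inference.
    Returns (majority_label, unanimous); a non-unanimous vote flags the input as abnormal.
    """
    votes = [predict(tokens, hidden) for hidden in masks]
    label = max(set(votes), key=votes.count)
    return label, len(set(votes)) == 1

# Toy stand-in for a model: it answers "dog" whenever any poisoned token is visible.
def toy_predict(poisoned, hidden):
    return "cat" if poisoned <= hidden else "dog"

masks = make_masks(grid=14, window=3, stride=2)
clean_label, clean_ok = censor(toy_predict, set(), masks)
attacked = {(5, 5), (5, 6), (6, 5), (6, 6)}  # a 2 x 2 adversarial patch
_, attacked_ok = censor(toy_predict, attacked, masks)
```

On the clean toy input all inferences agree, so the prediction is accepted. With the patch present, the inference whose mask hides the patch still answers "cat" while the others answer "dog", so the disagreement exposes the input as abnormal. Changing `window` and `stride` is how this sketch would adapt to a different assumed patch size.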

Defending Against Universal Patch Attacks by Restricting Token Attention in Vision Transformers

Protego: Detecting Adversarial Examples for Vision Transformers Via Intrinsic Capabilities

Are Vision Transformers Robust to Patch Perturbations?

ViTGuard: Attention-aware Detection against Adversarial Examples for Vision Transformer

Zero-Shot Certified Defense against Adversarial Patches with Vision Transformers

Towards Practical Certifiable Patch Defense with Vision Transformer

Towards Efficient Adversarial Training on Vision Transformers

Defending Backdoor Attacks on Vision Transformer via Patch Processing

Query-Efficient Hard-Label Black-Box Attack against Vision Transformers

Improving transferable adversarial attack for vision transformers via global attention and local drop

On Improving Adversarial Transferability of Vision Transformers

Inheritance Attention Matrix-Based Universal Adversarial Perturbations on Vision Transformers

Enhancing the robustness of vision transformer defense against adversarial attacks based on squeeze-and-excitation module

PatchCensor: Patch Robustness Certification for Transformers Via Exhaustive Testing

On the Adversarial Robustness of Vision Transformers

Attention Deficit is Ordered! Fooling Deformable Vision Transformers with Collaborative Adversarial Patches

Towards Transferable Adversarial Attacks on Image and Video Transformers

Give Me Your Attention: Dot-Product Attention Considered Harmful for Adversarial Patch Robustness

Improving the Transferability of Adversarial Examples with Restructure Embedded Patches

Not All Patches are What You Need: Expediting Vision Transformers via Token Reorganizations

Attacking Transformers with Feature Diversity Adversarial Perturbation