Abstract: As machine learning (ML) systems become pervasive, safeguarding their security is critical. However, it has recently been demonstrated that motivated adversaries can mislead ML systems by perturbing test data with semantic transformations. While a rich body of research provides provable robustness guarantees for ML models against Lp-bounded adversarial perturbations, guarantees against semantic perturbations remain largely underexplored. In this paper, we present TSS, a unified framework for certifying ML robustness against general adversarial semantic transformations. First, depending on the properties of each transformation, we divide common transformations into two categories: resolvable (e.g., Gaussian blur) and differentially resolvable (e.g., rotation) transformations. For the former, we propose transformation-specific randomized smoothing strategies and obtain strong robustness certification. The latter category covers transformations that involve interpolation errors, for which we propose a novel approach based on stratified sampling to certify robustness. Our framework TSS leverages these certification strategies and combines them with consistency-enhanced training to provide rigorous certification of robustness. We conduct extensive experiments on over ten types of challenging semantic transformations and show that TSS significantly outperforms the state of the art. Moreover, to the best of our knowledge, TSS is the first approach that achieves nontrivial certified robustness on the large-scale ImageNet dataset. For instance, our framework achieves 30.4% certified robust accuracy against rotation attacks (within ±30°) on ImageNet. Finally, to consider a broader range of transformations, we show that TSS is also robust against adaptive attacks and unforeseen image corruptions such as those in CIFAR-10-C and ImageNet-C.
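To make the idea of transformation-specific randomized smoothing concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a hypothetical additive brightness shift as the transformation, a toy threshold classifier as the base model, and Gaussian noise on the transformation parameter; the names `brightness`, `toy_classifier`, and `smoothed_predict` are illustrative only. It shows only the smoothed prediction step (a majority vote over randomly transformed inputs) and does not reproduce the certification bound.

```python
import numpy as np


def brightness(x, b):
    """Hypothetical transformation for illustration: additive brightness shift."""
    return x + b


def toy_classifier(x):
    """Stand-in base classifier: class 1 if mean intensity exceeds 0.5, else 0."""
    return int(x.mean() > 0.5)


def smoothed_predict(classifier, transform, x, sigma, n_samples=1000, seed=0):
    """Majority vote of `classifier` over copies of `x` transformed with
    parameters drawn from N(0, sigma^2). Returns (label, vote fraction)."""
    rng = np.random.default_rng(seed)
    params = rng.normal(0.0, sigma, size=n_samples)
    votes = np.array([classifier(transform(x, b)) for b in params])
    counts = np.bincount(votes, minlength=2)  # two classes in this toy setup
    top = int(counts.argmax())
    return top, counts[top] / n_samples


bright_image = np.full((4, 4), 0.9)
label, fraction = smoothed_predict(toy_classifier, brightness, bright_image, sigma=0.1)
print(label, fraction)
```

In this sketch, the vote fraction plays the role of an empirical estimate of the top-class probability under parameter noise; an actual certificate would additionally need a confidence bound on that estimate and a transformation-specific robustness condition, neither of which is shown here.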

Certifying Semantic Robustness of Deep Neural Networks

Measuring Robustness of Deep Neural Networks from the Lens of Statistical Model Checking

SoK: Certified Robustness for Deep Neural Networks

Guiding the Comparison of Neural Network Local Robustness: an Empirical Study

Towards Certifying the Asymmetric Robustness for Neural Networks: Quantification and Applications

Certifying Global Robustness for Deep Neural Networks

CC-CERT: A Probabilistic Approach to Certify General Robustness of Neural Networks

A Survey of Neural Network Robustness Assessment in Image Recognition

Towards Certified Probabilistic Robustness with High Accuracy

Analyzing Adversarial Robustness of Deep Neural Networks in Pixel Space: a Semantic Perspective

General Lipschitz: Certified Robustness Against Resolvable Semantic Transformations via Transformation-Dependent Randomized Smoothing

Globally-Robust Neural Networks

Towards Certifying L∞ Robustness Using Neural Networks with L∞-Dist Neurons

Certified Adversarial Robustness Under the Bounded Support Set

Measuring Neural Net Robustness with Constraints

Semidefinite relaxations for certifying robustness to adversarial examples

Certifying Robustness of Convolutional Neural Networks with Tight Linear Approximation

Adversarial Robustness Certification for Bayesian Neural Networks

Towards Reliable Evaluation and Fast Training of Robust Semantic Segmentation Models

TSS - Transformation-Specific Smoothing for Robustness Certification

On the Robustness of Semantic Segmentation Models to Adversarial Attacks