Abstract: Certifiable robustness, the ability to verify whether a given region surrounding a data point admits any adversarial example, provides guaranteed security for neural networks deployed in adversarial environments. A plethora of work has been proposed to certify the robustness of feed-forward networks, e.g., FCNs and CNNs. Yet most existing methods cannot be directly applied to recurrent neural networks (RNNs) because of their sequential inputs and unique operations. In this paper, we present Cert-RNN, a general framework for certifying the robustness of RNNs. Specifically, through a detailed analysis of the intrinsic properties of the unique functions in different ranges, we exhaustively discuss the different cases for the exact formula of the bounding planes, based on which we design several precise and efficient abstract transformers for the unique calculations in RNNs. Cert-RNN significantly outperforms state-of-the-art methods (e.g., POPQORN) in terms of (i) effectiveness: it provides much tighter robustness bounds, and (ii) efficiency: it scales to much more complex models. Through extensive evaluation, we validate Cert-RNN's superior performance across various network architectures (e.g., vanilla RNN and LSTM) and applications (e.g., image classification, sentiment analysis, toxic comment detection, and malicious URL detection). For instance, for the RNN-2-32 model on the MNIST sequence dataset, the robustness bound certified by Cert-RNN is on average 1.86 times larger than that certified by POPQORN.
Besides certifying the robustness of given RNNs, Cert-RNN also enables a range of practical applications, including evaluating the provable effectiveness of various defenses (i.e., the defense with a larger robustness region is considered to be more robust), improving the robustness of RNNs (i.e., combining Cert-RNN with verified robust training), and identifying sensitive words (i.e., the word with the smallest certified robustness bound is considered to be the most sensitive word in a sentence), which helps build more robust and interpretable deep learning systems. We will open-source Cert-RNN to facilitate DNN security research.
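
To make the notion of a certified robustness bound concrete, the sketch below certifies a tiny vanilla RNN with plain interval bound propagation. This is a deliberately simpler technique swapped in for illustration; it is not Cert-RNN's abstract transformers, and it will generally give looser bounds. The function names (`affine_bounds`, `rnn_output_bounds`, `certified_radius`), the tanh activation, the zero initial state, and the L-infinity perturbation applied to every input frame are all assumptions made for this example.

```python
import numpy as np

def affine_bounds(W, lo, hi):
    """Sound interval bounds on W @ v for any v with lo <= v <= hi (elementwise)."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ lo + W_neg @ hi, W_pos @ hi + W_neg @ lo

def rnn_output_bounds(x_seq, eps, W_in, W_h, b, W_out, b_out):
    """Bound the output logits when each frame is perturbed by at most eps (L-inf)."""
    h_lo = np.zeros(W_h.shape[0])  # assumed zero initial hidden state
    h_hi = np.zeros(W_h.shape[0])
    for x in x_seq:
        in_lo, in_hi = affine_bounds(W_in, x - eps, x + eps)
        rec_lo, rec_hi = affine_bounds(W_h, h_lo, h_hi)
        # tanh is monotonically increasing, so it maps interval ends to interval ends.
        h_lo = np.tanh(in_lo + rec_lo + b)
        h_hi = np.tanh(in_hi + rec_hi + b)
    out_lo, out_hi = affine_bounds(W_out, h_lo, h_hi)
    return out_lo + b_out, out_hi + b_out

def is_certified(x_seq, eps, label, params):
    """True if no perturbation within eps can make another class score higher."""
    lo, hi = rnn_output_bounds(x_seq, eps, *params)
    return bool(lo[label] > np.delete(hi, label).max())

def certified_radius(x_seq, label, params, eps_max=1.0, iters=20):
    """Binary search for the largest eps in [0, eps_max] that can be certified.

    Interval bounds only widen as eps grows, so certification is monotone in eps.
    """
    lo_eps, hi_eps = 0.0, eps_max
    for _ in range(iters):
        mid = 0.5 * (lo_eps + hi_eps)
        if is_certified(x_seq, mid, label, params):
            lo_eps = mid
        else:
            hi_eps = mid
    return lo_eps
```

Calling `certified_radius(x_seq, label, params)` with `params = (W_in, W_h, b, W_out, b_out)` returns a radius within which the prediction provably cannot change under this relaxation. A tighter relaxation would certify a larger radius for the same model and input.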

Towards Certifying the Asymmetric Robustness for Neural Networks: Quantification and Applications

Attack As Defense: Characterizing Adversarial Examples Using Robustness.

There is Limited Correlation Between Coverage and Robustness for Deep Neural Networks

SoK: Certified Robustness for Deep Neural Networks

Adversarial robustness improvement for deep neural networks

Towards Evaluating the Robustness of Neural Networks

Robustra: Training Provable Robust Neural Networks over Reference Adversarial Space.

CC-CERT: A Probabilistic Approach to Certify General Robustness of Neural Networks

Towards Certified Probabilistic Robustness with High Accuracy

Adversarial Robustness Certification for Bayesian Neural Networks

Towards Certifying L∞ Robustness Using Neural Networks with L∞-Dist Neurons

Certifying Semantic Robustness of Deep Neural Networks

Interpreting and Improving Adversarial Robustness of Deep Neural Networks With Neuron Sensitivity

Cert-RNN - Towards Certifying the Robustness of Recurrent Neural Networks.

Interpreting and Evaluating Neural Network Robustness

A Survey of Neural Network Robustness Assessment in Image Recognition

Certifying Global Robustness for Deep Neural Networks

Globally-Robust Neural Networks

Certifying Joint Adversarial Robustness for Model Ensembles

Et Tu Certifications: Robustness Certificates Yield Better Adversarial Examples

Local Competition and Uncertainty for Adversarial Robustness in Deep Learning