Abstract: Deep equilibrium models (DEQs) refrain from the traditional layer-stacking paradigm and turn to find the fixed point of a single layer. DEQs have achieved promising performance on different applications with featured memory efficiency. At the same time, the adversarial vulnerability of DEQs raises concerns. Several works propose to certify robustness for monotone DEQs. However, limited efforts are devoted to studying empirical robustness for general DEQs. To this end, we observe that an adversarially trained DEQ requires more forward steps to arrive at the equilibrium state, or even violates its fixed-point structure. Besides, the forward and backward tracks of DEQs are misaligned due to the black-box solvers. These facts cause gradient obfuscation when applying the ready-made attacks to evaluate or adversarially train DEQs. Given this, we develop approaches to estimate the intermediate gradients of DEQs and integrate them into the attacking pipelines. Our approaches facilitate fully white-box evaluations and lead to effective adversarial defense for DEQs. Extensive experiments on CIFAR-10 validate the adversarial robustness of DEQs competitive with deep networks of similar sizes.

What problem does this paper attempt to address?

This paper addresses the vulnerability of deep equilibrium models (DEQs) to adversarial attacks. DEQs perform well across applications and are memory-efficient, but their sensitivity to adversarial examples has raised concerns. The paper notes that existing research focuses mainly on certified robustness for monotone DEQs, while the empirical robustness of general DEQs has received relatively little study.

The authors observe two obstacles. First, an adversarially trained DEQ requires more forward steps to reach its equilibrium state and may even violate its fixed-point structure. Second, because the solvers are black boxes, the forward and backward trajectories of a DEQ are misaligned. Together these cause gradient obfuscation, so ready-made attacks cannot reliably evaluate or adversarially train DEQs. To address this, the authors develop methods for estimating the intermediate gradients of DEQs and integrate them into the attack pipeline, which enables fully white-box evaluation and leads to effective adversarial defense. Extensive experiments indicate that DEQs achieve adversarial robustness competitive with deep networks of similar size.

The main contributions of the paper are:

1. **Identifying challenges**: A detailed analysis of the challenges in training robust DEQs, including the convergence of black-box solvers and the misalignment of forward and backward trajectories.
2. **Intermediate gradient estimation**: Two methods for estimating the intermediate gradients of DEQs are proposed: the simultaneous adjoint method and the unrolling method.
3. **White-box attack and defense**: A white-box attack based on intermediate gradients is designed, and defense strategies that use intermediate states, such as early exit and state aggregation, are proposed.
4. **Experimental verification**: Extensive experiments on the CIFAR-10 dataset verify the effectiveness of the proposed methods and demonstrate that DEQs are competitive in adversarial robustness.

In summary, the paper aims to make DEQs more reliable and secure in practice by improving how their adversarial robustness is evaluated and defended.
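To make the idea of an intermediate gradient concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy scalar layer `f(z, x) = tanh(w*z + u*x + b)` with hypothetical parameters `w`, `u`, `b`, runs plain fixed-point iteration, and tracks the gradient of each intermediate state with respect to the input by unrolling the chain rule. The gradient at the converged state is then compared with the one given by the implicit function theorem.

```python
import math

def f(z, x, w=0.5, u=1.0, b=0.1):
    """One application of the single layer (toy scalar stand-in, assumed form)."""
    return math.tanh(w * z + u * x + b)

def solve_with_unrolled_grads(x, steps, w=0.5, u=1.0, b=0.1):
    """Iterate z_{k+1} = f(z_k, x) from z_0 = 0 and track dz_k/dx by unrolling.

    Returns a list of (z_k, dz_k/dx) pairs for k = 1..steps.
    """
    z, dz_dx = 0.0, 0.0
    trajectory = []
    for _ in range(steps):
        z_next = f(z, x, w, u, b)
        # Chain rule: d tanh(a)/da = 1 - tanh(a)^2, with a = w*z + u*x + b.
        dz_dx = (1.0 - z_next ** 2) * (w * dz_dx + u)
        z = z_next
        trajectory.append((z, dz_dx))
    return trajectory

def implicit_grad(z_star, w=0.5, u=1.0):
    """dz*/dx at the fixed point via the implicit function theorem."""
    s = 1.0 - z_star ** 2  # tanh derivative evaluated at the fixed point
    return (u * s) / (1.0 - w * s)

traj = solve_with_unrolled_grads(x=0.3, steps=50)
z_star, g_unrolled = traj[-1]   # state and gradient after 50 steps
z_early, g_early = traj[1]      # state and gradient after only 2 steps
g_implicit = implicit_grad(z_star)

print("residual |f(z*) - z*|:", abs(f(z_star, 0.3) - z_star))
print("unrolled vs implicit gradient gap:", abs(g_unrolled - g_implicit))
print("early-state vs implicit gradient gap:", abs(g_early - g_implicit))
```

In this toy setting the layer is a contraction, so the unrolled gradient approaches the implicit one as the iteration converges, while the gradient at an early state differs noticeably. That gap is the kind of mismatch an attack has to account for when a model exits early or the solver has not converged.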

A Closer Look at the Adversarial Robustness of Deep Equilibrium Models