Abstract: Important research efforts have focused on the design and training of neural networks with a controlled Lipschitz constant. The goal is to increase and sometimes guarantee the robustness against adversarial attacks. Recent promising techniques draw inspiration from different backgrounds to design 1-Lipschitz neural networks; to name just a few, convex potential layers derive from the discretization of continuous dynamical systems, and the Almost-Orthogonal-Layer (AOL) proposes a tailored method for matrix rescaling. However, it is now important to consider these recent and promising contributions under a common theoretical lens to better design new and improved layers. This paper introduces a novel algebraic perspective unifying various types of 1-Lipschitz neural networks, including the ones previously mentioned, along with methods based on orthogonality and spectral methods. Interestingly, we show that many existing techniques can be derived and generalized by finding analytical solutions of a common semidefinite programming (SDP) condition. We also prove that AOL biases the scaled weights toward matrices that are close to the set of orthogonal matrices in a certain mathematical sense. Moreover, our algebraic condition, combined with the Gershgorin circle theorem, readily leads to new and diverse parameterizations for 1-Lipschitz network layers. Our approach, called SDP-based Lipschitz Layers (SLL), allows us to design non-trivial yet efficient generalizations of convex potential layers. Finally, a comprehensive set of experiments on image classification shows that SLLs outperform previous approaches on certified robust accuracy. Code is available at https://github.com/araujoalexandre/Lipschitz-SLL-Networks.
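
To make the "matrix rescaling" idea concrete, here is a minimal NumPy sketch of an AOL-style rescaling, assuming the commonly stated form in which column i of W is divided by the square root of sum_j |W^T W|_ij. The function name `aol_rescale`, the `eps` safeguard, and the random test matrix are illustrative assumptions and are not taken from the linked repository. The bound on the spectral norm follows from the Gershgorin circle theorem: diag(sum_j |W^T W|_ij) - W^T W is diagonally dominant with a nonnegative diagonal, hence positive semidefinite.

```python
import numpy as np

def aol_rescale(W, eps=1e-12):
    """Rescale the columns of W so the result has spectral norm at most 1.

    Hypothetical helper: an AOL-style rescaling, W @ diag(d) with
    d_i = (sum_j |W^T W|_ij) ** -0.5. `eps` only guards against zero columns.
    """
    gram_abs = np.abs(W.T @ W)            # |W^T W|, shape (n, n)
    row_sums = gram_abs.sum(axis=1)       # sum_j |W^T W|_ij for each i
    d = 1.0 / np.sqrt(row_sums + eps)     # per-column scaling factors
    return W * d                          # broadcasting: same as W @ np.diag(d)

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 32))             # arbitrary dense weight matrix
W_scaled = aol_rescale(W)

# Largest singular value of the rescaled matrix (should not exceed 1).
spectral_norm = np.linalg.norm(W_scaled, 2)

# A matrix with orthonormal columns is left (numerically) unchanged,
# since W^T W = I makes every scaling factor equal to 1.
Q, _ = np.linalg.qr(rng.normal(size=(64, 32)))
Q_scaled = aol_rescale(Q)
```

A linear map whose weight matrix has spectral norm at most 1 is 1-Lipschitz in the Euclidean norm, which is why a rescaling of this kind can serve as a building block for the layers the abstract describes.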

Hidden Synergy: $L_1$ Weight Normalization and 1-Path-Norm Regularization

Robust Implicit Regularization via Weight Normalization

Learning Sparse Neural Networks through L0 Regularization

The Role of Regularization in Shaping Weight and Node Pruning Dependency and Dynamics

How to Initialize your Network? Robust Initialization for WeightNorm & ResNets

Normalization and effective learning rates in reinforcement learning

SSN: Learning Sparse Switchable Normalization via SparsestMax

StreamliNet: Cost-aware Layer-wise Neural Network Linearization for Fast and Accurate Private Inference

Training a neural netwok for data reduction and better generalization

Robust and Provably Monotonic Networks

Beyond BatchNorm: Towards a Unified Understanding of Normalization in Deep Learning

Projection based weight normalization: Efficient method for optimization on oblique manifold in DNNs

Training Sparse Neural Network by Constraining Synaptic Weight on Unit Lp Sphere

On the Nonlinearity of Layer Normalization

Training Compact DNNs with $\ell_{1/2}$ Regularization

L0 Regularization Based Neural Network Design and Compression

Sparse Deep Learning Models with the $\ell_1$ Regularization

A Unified Algebraic Perspective on Lipschitz Neural Networks

LSOP: Layer-Scaled One-shot Pruning

On the Compression of Neural Networks Using $\ell_0$-Norm Regularization and Weight Pruning

Convergence of a Relaxed Variable Splitting Method for Learning Sparse Neural Networks via $\ell_1, \ell_0$, and transformed-$\ell_1$ Penalties