Adversarial Training of Two-Layer Polynomial and ReLU Activation Networks via Convex Optimization

Daniel Kuelbs, Sanjay Lall, Mert Pilanci

2024-10-17

Abstract: Training neural networks which are robust to adversarial attacks remains an important problem in deep learning, especially as heavily overparameterized models are adopted in safety-critical settings. Drawing from recent work which reformulates the training problems for two-layer ReLU and polynomial activation networks as convex programs, we devise a convex semidefinite program (SDP) for adversarial training of two-layer polynomial activation networks and prove that the convex SDP achieves the same globally optimal solution as its nonconvex counterpart. The convex SDP is observed to improve robust test accuracy against $\ell_\infty$ attacks relative to the original convex training formulation on multiple datasets. Additionally, we present scalable implementations of adversarial training for two-layer polynomial and ReLU networks which are compatible with standard machine learning libraries and GPU acceleration. Leveraging these implementations, we retrain the final two fully connected layers of a Pre-Activation ResNet-18 model on the CIFAR-10 dataset with both polynomial and ReLU activations. The two "robustified" models achieve significantly higher robust test accuracies against $\ell_\infty$ attacks than a Pre-Activation ResNet-18 model trained with sharpness-aware minimization, demonstrating the practical utility of convex adversarial training on large-scale problems.
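
For orientation, adversarial training against $\ell_\infty$ attacks is commonly posed as a min-max problem. The statement below is the generic robust-optimization form; the symbols $f_\theta$ (network with parameters $\theta$), $\ell$ (loss), $\epsilon$ (attack radius), and $\delta_i$ (perturbation of sample $x_i$) are assumed notation for illustration and are not taken from the paper.

```latex
\min_{\theta} \; \frac{1}{n} \sum_{i=1}^{n} \;
  \max_{\|\delta_i\|_\infty \le \epsilon} \;
  \ell\bigl(f_\theta(x_i + \delta_i),\, y_i\bigr)
```

The inner maximization picks the worst perturbation inside the $\ell_\infty$ ball for each training sample, and the outer minimization fits the parameters against those worst cases. According to the abstract, the nonconvex version of this training problem for two-layer polynomial activation networks has a convex SDP counterpart with the same globally optimal solution.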

Machine Learning, Optimization and Control

What problem does this paper attempt to address?

This paper addresses the robustness of neural networks to adversarial attacks. Specifically, the authors study how to train robust two-layer polynomial activation and ReLU activation networks using convex optimization. Although deep learning models perform well in many applications, they remain vulnerable to adversarial attacks, which is a particular concern in safety-critical settings. Moreover, traditional adversarial training methods often require substantial computing resources and usually need to train the model from scratch to achieve the best results. To address these problems, the authors propose an adversarial training method for two-layer polynomial activation networks based on a convex semidefinite program (SDP). They prove that this convex SDP attains the same globally optimal solution as the nonconvex adversarial training problem. Experiments show that the method improves robust test accuracy against $\ell_\infty$ attacks, relative to the original convex training formulation, on multiple datasets. In addition, the authors provide scalable adversarial training implementations for two-layer polynomial and ReLU networks that are compatible with standard machine learning libraries and support GPU acceleration. Using these implementations, they retrain the final two fully connected layers of a Pre-Activation ResNet-18 model on CIFAR-10 with both polynomial and ReLU activations. The two "robustified" models achieve significantly higher robust test accuracy against $\ell_\infty$ attacks than a Pre-Activation ResNet-18 model trained with sharpness-aware minimization (SAM), demonstrating the practical utility of convex adversarial training on large-scale problems.
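
To make "robust test accuracy against $\ell_\infty$ attacks" concrete, the following is a minimal NumPy sketch of how such a number could be measured for a two-layer ReLU network. It is not the paper's convex training method: the attack used here is the single-step fast gradient sign method (FGSM), chosen only as a simple stand-in, and the network weights, the hinge loss, the data, and all function names (`forward`, `fgsm`, `accuracies`) are hypothetical.

```python
import numpy as np

def forward(X, U, alpha):
    """Two-layer ReLU network: f(x) = sum_j relu(x . u_j) * alpha_j."""
    return np.maximum(X @ U, 0.0) @ alpha

def predict(X, U, alpha):
    """Binary labels in {-1, +1} from the sign of the network output."""
    return np.where(forward(X, U, alpha) >= 0, 1.0, -1.0)

def input_gradient(X, y, U, alpha):
    """Gradient of the hinge loss max(0, 1 - y f(x)) w.r.t. each input row."""
    active = (X @ U > 0).astype(X.dtype)         # ReLU derivative, shape (n, m)
    df_dx = (active * alpha) @ U.T               # gradient of f w.r.t. x, shape (n, d)
    margin = y * forward(X, U, alpha)
    dloss_df = np.where(margin < 1.0, -y, 0.0)   # hinge derivative w.r.t. f
    return dloss_df[:, None] * df_dx

def fgsm(X, y, U, alpha, eps):
    """One signed-gradient step; the perturbation has l_inf norm at most eps."""
    return X + eps * np.sign(input_gradient(X, y, U, alpha))

def accuracies(X, y, U, alpha, eps):
    """Clean accuracy, and the fraction correct both before and after the attack."""
    clean_ok = predict(X, U, alpha) == y
    adv_ok = predict(fgsm(X, y, U, alpha, eps), U, alpha) == y
    return clean_ok.mean(), (clean_ok & adv_ok).mean()

# Synthetic data and random weights, purely for illustration.
rng = np.random.default_rng(0)
n, d, m = 200, 10, 16
X = rng.normal(size=(n, d))
U = rng.normal(size=(d, m))
alpha = rng.normal(size=m)
y = predict(X, U, alpha)  # labels taken from the network itself

clean_acc, robust_acc = accuracies(X, y, U, alpha, eps=0.1)
print(f"clean accuracy:  {clean_acc:.3f}")
print(f"robust accuracy: {robust_acc:.3f}")
```

Counting a sample as robustly correct only when it is classified correctly both before and after the attack guarantees that the robust accuracy never exceeds the clean accuracy. A stronger multi-step attack would generally give a lower, more reliable estimate than the single FGSM step used in this sketch.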

Practical Convex Formulation of Robust One-hidden-layer Neural Network Training

Improving Adversarial Robustness of Deep Neural Networks Via Linear Programming

Convex Formulations for Training Two-Layer ReLU Neural Networks

Tight Certification of Adversarially Trained Neural Networks via Nonconvex Low-Rank Semidefinite Relaxations

Convergence of Adversarial Training in Overparametrized Neural Networks

Convergence of Adversarial Training in Overparametrized Networks

Is ReLU Adversarially Robust?

A constrained optimization approach to improve robustness of neural networks

Sparta: Spatially Attentive and Adversarially Robust Activations

Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality

You Only Propagate Once: Painless Adversarial Training Using Maximal Principle

On Robustness to Adversarial Examples and Polynomial Optimization

Semidefinite relaxations for certifying robustness to adversarial examples

A randomized gradient-free attack on ReLU networks

Sparta: Spatially Attentive and Adversarially Robust Activation

Robustness Against Adversarial Attacks via Learning Confined Adversarial Polytopes

Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data

Regularizing Deep Networks Using Efficient Layerwise Adversarial Training

Deep Adversarial Defense Against Multilevel-Lp Attacks