Abstract: In this work, we propose a robust framework that employs adversarially robust training to safeguard the ML models against perturbed testing data. Our contributions can be seen from both computational and statistical perspectives. Firstly, from a computational/optimization point of view, we derive the ready-to-use exact solution for several widely used loss functions with a variety of norm constraints on adversarial perturbation for various supervised and unsupervised ML problems, including regression, classification, two-layer neural networks, graphical models, and matrix completion. The solutions are either in closed-form, or an easily tractable optimization problem such as 1-D convex optimization, semidefinite programming, difference of convex programming or a sorting-based algorithm. Secondly, from a statistical/generalization viewpoint, using some of these results, we derive novel bounds of the adversarial Rademacher complexity for various problems, which entails new generalization bounds. Thirdly, we perform some sanity-check experiments on real-world datasets for supervised problems such as regression and classification, as well as for unsupervised problems such as matrix completion and learning graphical models, with very little computational overhead.
What problem does this paper attempt to address?
This paper aims to make machine-learning models robust to adversarial perturbations. Specifically, the authors propose a framework that uses adversarially robust training to protect models against perturbed test data. The framework covers a range of supervised and unsupervised learning problems, including regression, classification, two-layer neural networks, graphical models, and matrix completion.
### Main contributions:
1. **Computation/optimization aspects**:
- The authors derive exact solutions for the worst-case perturbation under several widely used loss functions and a variety of norm constraints. Each solution is either in closed form or an easily tractable optimization problem, such as one-dimensional convex optimization, semidefinite programming, difference-of-convex (DC) programming, or a sorting-based algorithm.
- These solutions are plug-and-play: they can be inserted directly into existing training algorithms, so practitioners can achieve adversarial robustness with minimal changes to existing models.
2. **Statistics/generalization aspects**:
- The authors provide new upper and lower bounds on the adversarial Rademacher complexity, thereby deriving new generalization bounds.
- New adversarial Rademacher complexity bounds are derived for problems such as linear regression, matrix completion, and maximum-margin matrix completion.
3. **Practical experiments**:
- Sanity-check experiments on multiple real-world datasets indicate that the proposed plug-and-play method generally outperforms other methods (such as FGSM, PGD, and TRADES) in both test metrics and running time.
### Specific content of the solution:
- **Linear regression**: For a given sample \((x^{(i)}, y^{(i)})\), the worst-case adversarial attack is the perturbation \(\Delta\) with \(\|\Delta\| \leq \epsilon\) that maximizes the squared prediction error:
\[
\Delta^{\star} = \arg\sup_{\|\Delta\| \leq \epsilon} \left(w^\top (x^{(i)} + \Delta) - y^{(i)}\right)^2
\]
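According to Theorem 1, if \(w^\top x^{(i)} - y^{(i)} = 0\), then \(\Delta^{\star} = \pm \epsilon \frac{v}{\|v\|}\) (either sign is optimal); otherwise \(\Delta^{\star} = \text{sign}(w^\top x^{(i)} - y^{(i)}) \epsilon \frac{v}{\|v\|}\). Here \(\|\cdot\|_*\) denotes the dual norm of \(\|\cdot\|\), and \(v \in \partial \|w\|_*\) is any subgradient of the dual norm at \(w\).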
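The closed form above can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes the perturbation norm is one of \(\ell_1\), \(\ell_2\), or \(\ell_\infty\), and the function names `dual_norm_subgradient` and `linreg_worst_case` are hypothetical names chosen for illustration. For these norms, the worst-case loss should equal \((|w^\top x - y| + \epsilon \|w\|_*)^2\).

```python
import numpy as np


def dual_norm_subgradient(w, p):
    """Return one subgradient v of the dual norm of ||.||_p at a nonzero w.

    Only p in {1, 2, inf} is handled (an assumption of this sketch).
    """
    w = np.asarray(w, dtype=float)
    if not np.any(w):
        raise ValueError("w must be nonzero")
    if p == 2:
        # l2 is self-dual: the gradient of ||w||_2 is w / ||w||_2.
        return w / np.linalg.norm(w)
    if p == np.inf:
        # The dual of l_inf is l1; sign(w) is a subgradient of ||w||_1.
        return np.sign(w)
    if p == 1:
        # The dual of l1 is l_inf; pick one coordinate of largest magnitude.
        v = np.zeros_like(w)
        j = np.argmax(np.abs(w))
        v[j] = np.sign(w[j])
        return v
    raise ValueError("this sketch supports p in {1, 2, inf} only")


def linreg_worst_case(w, x, y, eps, p=2):
    """Worst-case perturbation for the squared loss under ||delta||_p <= eps."""
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)
    v = dual_norm_subgradient(w, p)
    residual = float(w @ x - y)
    # When the residual is zero either sign is optimal; +1 is chosen here.
    sign = 1.0 if residual >= 0 else -1.0
    return sign * eps * v / np.linalg.norm(v, ord=p)


w = np.array([3.0, -4.0])
x = np.array([1.0, 1.0])
y = 0.0
delta = linreg_worst_case(w, x, y, eps=0.5, p=2)
adv_loss = float((w @ (x + delta) - y) ** 2)
# Here the residual is -1 and ||w||_2 = 5, so the closed form gives
# (1 + 0.5 * 5)^2 = 12.25, which adv_loss matches up to rounding.
```

Because the perturbation is available in closed form, a training loop could evaluate the loss at `x + delta` directly instead of running an inner iterative attack.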
- **Logistic regression**: The worst-case adversarial attack maximizes the logistic loss over the perturbation:
\[
\Delta^{\star} = \arg\sup_{\|\Delta\| \leq \epsilon} \log\left(1 + \exp(-y^{(i)} w^\top (x^{(i)} + \Delta))\right)
\]
According to Theorem 2, the optimal solution is \(\Delta^{\star} = -\epsilon y^{(i)} \frac{v}{\|v\|}\), where \(v \in \partial \|w\|_*\).
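As a rough illustration of the logistic-regression result, the sketch below computes \(\Delta^{\star} = -\epsilon y \frac{v}{\|v\|}\) and compares the resulting loss with randomly sampled perturbations in the same ball. It assumes labels \(y \in \{-1, +1\}\) and an \(\ell_2\) or \(\ell_\infty\) perturbation norm; the names `logistic_worst_case` and `logistic_loss` are hypothetical, and the random comparison is only a sanity check, not a proof of optimality.

```python
import numpy as np


def logistic_loss(w, x, y):
    """log(1 + exp(-y * w^T x)), computed stably with logaddexp."""
    return float(np.logaddexp(0.0, -y * (w @ x)))


def logistic_worst_case(w, y, eps, p=2):
    """Worst-case perturbation -eps * y * v / ||v||_p for a nonzero w.

    v is a subgradient of the dual norm at w; only p in {2, inf} is
    handled (an assumption of this sketch).
    """
    w = np.asarray(w, dtype=float)
    if p == 2:
        v = w / np.linalg.norm(w)   # l2 is self-dual
    elif p == np.inf:
        v = np.sign(w)              # dual of l_inf is l1
    else:
        raise ValueError("this sketch supports p in {2, inf} only")
    return -eps * y * v / np.linalg.norm(v, ord=p)


w = np.array([3.0, -4.0])
x = np.array([1.0, 1.0])
y, eps = 1.0, 0.5

delta = logistic_worst_case(w, y, eps, p=2)
adv_loss = logistic_loss(w, x + delta, y)

# The attack lowers the margin y * w^T x by eps * ||w||_2, so the loss
# should equal log(1 + exp(-y * w^T x + eps * ||w||_2)).
closed_form = float(np.logaddexp(0.0, -y * (w @ x) + eps * np.linalg.norm(w)))

# Sanity check: random perturbations on the l2 sphere of radius eps
# should never give a larger loss than delta.
rng = np.random.default_rng(0)
directions = rng.normal(size=(1000, 2))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
random_losses = [logistic_loss(w, x + eps * d, y) for d in directions]
```

Note that the perturbation does not depend on \(x^{(i)}\) beyond the label, so for a fixed \(w\) it can be computed once per class and reused across a batch.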
- **Two-layer neural network**: For a two-layer neural network used for binary classification, the worst-case adversarial attack can be expressed as:
\[
\Delta^{\star} = \arg\sup_{\|\Delta\| \leq \epsilon} \log\left(1 + \exp(-y^{(i)} v^\top \sigma(W^\top (x^{(i)} + \Delta)))\right)
\]
According to Theorem 3, this problem can be solved by difference-of-convex (DC) programming.
- **Gaussian graphical model**: For the Gaussian graphical model, the worst-case...