Abstract: Neural Architecture Search (NAS) paves the way for the automatic definition of Neural Network (NN) architectures, attracting increasing research attention and offering solutions in various scenarios. This study introduces a novel NAS solution, called Flat Neural Architecture Search (FlatNAS), which explores the interplay between a novel figure of merit based on robustness to weight perturbations and single NN optimization with Sharpness-Aware Minimization (SAM). FlatNAS is the first work in the literature to systematically explore flat regions in the loss landscape of NNs in a NAS procedure, while jointly optimizing their performance on in-distribution data, their out-of-distribution (OOD) robustness, and constraining the number of parameters in their architecture. Differently from current studies primarily concentrating on OOD algorithms, FlatNAS successfully evaluates the impact of NN architectures on OOD robustness, a crucial aspect in real-world applications of machine and deep learning. FlatNAS achieves a good trade-off between performance, OOD generalization, and the number of parameters, by using only in-distribution data in the NAS exploration. The OOD robustness of the NAS-designed models is evaluated by focusing on robustness to input data corruptions, using popular benchmark datasets in the literature.
What problem does this paper attempt to address?
The paper asks how Neural Architecture Search (NAS) can systematically explore flat regions in the loss landscape of neural networks (NNs) so that the resulting models are more robust on out-of-distribution (OOD) data, while still performing well on in-distribution data and staying within a parameter budget. To this end it proposes FlatNAS (Flat Neural Architecture Search), which the authors present as the first work to explore flat regions systematically within a NAS procedure, jointly optimizing in-distribution performance and OOD robustness under a constraint on the number of parameters.
### Main research questions
1. **Improving OOD robustness**: Existing NAS methods mainly focus on model performance on in-distribution data and largely ignore robustness on OOD data. FlatNAS aims to improve OOD robustness by introducing a new figure of merit and a matching optimization strategy.
2. **Constraint on the number of parameters**: In practical applications, the number of model parameters is an important constraint. While optimizing performance and OOD robustness, FlatNAS also bounds the number of parameters so that the resulting models remain usable in resource-constrained environments.
3. **Utilizing flat regions**: The paper proposes to use flat regions to improve model robustness. A flat region is a part of weight space in which the loss changes little under small perturbations of the weights; such regions are usually associated with better generalization. FlatNAS steers the search toward these regions by combining its robustness metric with Sharpness-Aware Minimization (SAM) when training each candidate network.
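
The SAM update mentioned in item 3 can be sketched as follows. This is a minimal, pure-Python illustration of one SAM step as it is commonly described: move to the approximate worst-case point within an L2 ball of radius `rho`, then update the original weights using the gradient computed there. The function name `sam_step`, the toy quadratic loss behind `toy_grad`, and the values of `rho` and `lr` are assumptions chosen for illustration, not details taken from the paper.

```python
import math

def sam_step(weights, grad_fn, rho=0.05, lr=0.1):
    """One Sharpness-Aware Minimization step on a list of float weights."""
    g = grad_fn(weights)
    norm = math.sqrt(sum(gi * gi for gi in g)) + 1e-12  # avoid division by zero
    # Move to the approximate worst-case point within an L2 ball of radius rho.
    w_adv = [w + rho * gi / norm for w, gi in zip(weights, g)]
    # Update the ORIGINAL weights using the gradient taken at the perturbed point.
    g_adv = grad_fn(w_adv)
    return [w - lr * gi for w, gi in zip(weights, g_adv)]

def toy_grad(weights):
    # Gradient of the hypothetical loss f(w) = sum(w_i^2).
    return [2.0 * w for w in weights]

new_w = sam_step([3.0, 4.0], toy_grad)
```

In a real training loop the gradient would come from backpropagation on a mini-batch, so each SAM step costs two forward/backward passes instead of one.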
### Specific methods
- **Metric**: FlatNAS introduces a new metric \(R(x, \sigma)\) to evaluate the robustness of the model under weight perturbation. This metric is defined as:
\[
R(x, \sigma)=\mathbb{E}_z\left[E_{\text{train}}(w(x)+\sigma z\odot w(x)) - E_{\text{train}}(w(x))\right]
\]
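where \(w(x)\) is the weight configuration of model \(x\), \(z\) is a random vector drawn from the standard normal distribution, \(\odot\) denotes element-wise multiplication, and \(\sigma\) is the perturbation intensity. Because the noise is multiplied by \(w(x)\), each weight is perturbed in proportion to its own magnitude.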
- **Multi-objective optimization**: FlatNAS simultaneously optimizes classification accuracy \(FA(x)\) and robustness \(R(x, \sigma)\) and constrains the number of parameters \(FP(x)\) through a multi-objective optimization function \(G\). The optimization problem can be expressed as:
\[
\min G(FA(x), R(x, \sigma), FP(x))\quad\text{s.t.}\quad FP(x)<FP_{\text{max}}
\]
- **Search strategy**: FlatNAS uses the NSGA-II genetic algorithm as its search strategy, combined with surrogate models (such as Gaussian processes and radial basis functions) to explore the search space efficiently.
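
The metric and the constrained selection above can be illustrated with a small sketch. It estimates \(R(x, \sigma)\) by Monte Carlo sampling and then keeps the candidates that satisfy \(FP(x) < FP_{\text{max}}\) and are not dominated on (error, \(R\)). Several things here are assumptions: `loss_fn` is a hypothetical stand-in for \(E_{\text{train}}\), the candidate dictionaries and their field names are invented for illustration, and plain Pareto dominance replaces the function \(G\), whose exact form is not given in this summary.

```python
import random

def robustness(loss_fn, weights, sigma, n_samples=200, seed=0):
    """Monte Carlo estimate of R = E_z[loss(w + sigma*z*w) - loss(w)], z ~ N(0, I)."""
    rng = random.Random(seed)
    base = loss_fn(weights)
    total = 0.0
    for _ in range(n_samples):
        # Multiplicative Gaussian noise: each weight is perturbed relative to its size.
        perturbed = [w + sigma * rng.gauss(0.0, 1.0) * w for w in weights]
        total += loss_fn(perturbed) - base
    return total / n_samples

def pareto_feasible(candidates, fp_max):
    """Keep candidates with params < fp_max that no other feasible candidate dominates."""
    feasible = [c for c in candidates if c["params"] < fp_max]

    def dominates(a, b):
        # a dominates b if it is no worse on both objectives and better on one.
        return (a["error"] <= b["error"] and a["r"] <= b["r"]
                and (a["error"] < b["error"] or a["r"] < b["r"]))

    return [c for c in feasible if not any(dominates(o, c) for o in feasible)]

# Hypothetical losses with the same minimum at w = (1, 1) but different curvature.
flat_loss = lambda w: 0.1 * sum((v - 1.0) ** 2 for v in w)
sharp_loss = lambda w: 10.0 * sum((v - 1.0) ** 2 for v in w)

r_flat = robustness(flat_loss, [1.0, 1.0], sigma=0.05)
r_sharp = robustness(sharp_loss, [1.0, 1.0], sigma=0.05)

candidates = [
    {"name": "A", "error": 0.10, "r": 0.50, "params": 1_000_000},
    {"name": "B", "error": 0.20, "r": 0.10, "params": 2_000_000},
    {"name": "C", "error": 0.20, "r": 0.60, "params": 1_000_000},  # dominated by A
    {"name": "D", "error": 0.05, "r": 0.05, "params": 9_000_000},  # over budget
]
front = pareto_feasible(candidates, fp_max=5_000_000)
```

With the same random draws, the sharper loss yields the larger estimate of \(R\), which is the behavior the metric is meant to capture: a lower \(R\) indicates a flatter neighborhood around the trained weights.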
### Experimental results
The paper reports experiments on CIFAR-10 and CIFAR-100 and on their corrupted OOD counterparts, CIFAR-10-C and CIFAR-100-C. The results show that FlatNAS can significantly improve robustness on OOD data without significantly reducing classification accuracy. This shows up in the following aspects:
- **mCE metric**: In multiple experimental settings, FlatNAS significantly reduces the mCE (mean Corruption Error), indicating that the model is more robust across corruption types and severity levels.
- **Number of parameters**: While optimizing the performance and robustness of the model, FlatNAS maintains a number of parameters comparable to that of traditional NAS methods, meeting the resource constraints in practical applications.
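
For reference, the mCE figure can be computed along these lines. The sketch follows the usual corruption-benchmark convention: for each corruption type, sum the error over severity levels, optionally divide by the same sum for a baseline model, then average over corruption types. Whether the paper normalizes by a baseline is not stated in this summary, so the `baseline` argument is optional; the function name `mean_corruption_error` and all numbers below are made up for illustration.

```python
def mean_corruption_error(errors, baseline=None):
    """errors / baseline: dict mapping corruption name -> list of error rates per severity."""
    per_corruption = []
    for name, errs in errors.items():
        if baseline is None:
            # Unnormalized variant: average error over severity levels.
            per_corruption.append(sum(errs) / len(errs))
        else:
            # Normalized variant: ratio of summed errors to the baseline's summed errors.
            per_corruption.append(sum(errs) / sum(baseline[name]))
    return sum(per_corruption) / len(per_corruption)

# Hypothetical error rates for two corruption types at two severity levels.
model_errors = {"noise": [0.2, 0.4], "blur": [0.1, 0.3]}
baseline_errors = {"noise": [0.4, 0.8], "blur": [0.2, 0.6]}

mce_raw = mean_corruption_error(model_errors)
mce_norm = mean_corruption_error(model_errors, baseline_errors)
```

Lower is better in both variants; a normalized value below 1 means the model makes fewer errors under corruption than the chosen baseline.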
In summary, FlatNAS has successfully improved the performance of the model on OOD data by systematically exploring flat regions.