Evolving Robust Neural Architectures to Defend from Adversarial Attacks

Shashank Kotyan, Danilo Vasconcellos Vargas
DOI: https://doi.org/10.48550/arXiv.1906.11667
2020-07-16
Abstract: Neural networks are prone to misclassify slightly modified input images. Recently, many defences have been proposed, but none have improved the robustness of neural networks consistently. Here, we propose to use adversarial attacks as a function evaluation to search for neural architectures that can resist such attacks automatically. Experiments on neural architecture search algorithms from the literature show that although accurate, they are not able to find robust architectures. A significant reason for this lies in their limited search space. By creating a novel neural architecture search with options for dense layers to connect with convolution layers and vice-versa as well as the addition of concatenation layers in the search, we were able to evolve an architecture that is inherently accurate on adversarial samples. Interestingly, this inherent robustness of the evolved architecture rivals state-of-the-art defences such as adversarial training while being trained only on the non-adversarial samples. Moreover, the evolved architecture makes use of some peculiar traits which might be useful for developing even more robust ones. Thus, the results here confirm that more robust architectures exist as well as opens up a new realm of feasibilities for the development and exploration of neural networks. Code available at http://bit.ly/RobustArchitectureSearch.
Neural and Evolutionary Computing, Artificial Intelligence, Cryptography and Security, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The problem that this paper attempts to solve is the robustness of neural networks in the face of adversarial attacks. Specifically, the paper aims to defend against adversarial attacks by automatically searching for neural network architectures with inherent robustness. The following is a summary of the main content of the paper:

### 1. Research Background

- **Adversarial Examples**: In 2013, Szegedy et al. found that neural networks are prone to misclassify slightly modified input images. Multiple adversarial attack methods were subsequently discovered (such as L0, L1, L2, and L∞ attacks). These attacks also pose a great threat in real-world scenarios.
- **Deficiencies of Existing Defense Methods**: Current defense methods (such as adversarial training and defensive distillation) have some effect, but have not been able to consistently improve the robustness of neural networks.

### 2. Core Problems of the Paper

The paper proposes a new method that uses Neural Architecture Search (NAS) to find neural network architectures that can resist adversarial attacks. Specific objectives include:

- **Evaluating Robustness**: Use the accuracy on adversarial examples as an evaluation function to measure the robustness of neural networks.
- **Expanding the Search Space**: Current NAS methods have a limited search space, making it difficult to find robust architectures. The paper proposes a broader search space, including connections between dense layers and convolutional layers and the introduction of concatenation layers.
- **Evolutionary Algorithms**: Automatically search for robust neural network architectures through evolutionary algorithms (such as genetic algorithms).

### 3. Method Overview

#### 3.1 Robustness Evaluation

- Use different types of adversarial attacks (L0, L1, L2, and L∞) to evaluate the robustness of the model.
- Reduce computational costs through transferable attacks, that is, use adversarial examples generated by other models to evaluate the robustness of new models.

#### 3.2 Improvement of Search Algorithms

- Modify existing NAS algorithms (such as SMASH and DeepArchitect), changing the fitness function from pursuing accuracy alone to considering accuracy and robustness together.
- Propose a new search algorithm, Robust Architecture Search (RAS), which maintains three levels of sub-populations (layer, block, model) and evolves them continuously through mutation operations (such as changing the convolution kernel size or adding/removing layers).

#### 3.3 Objective Function

- The fitness of the model is jointly determined by its accuracy on the test data set and its accuracy on adversarial examples (Fitness = Accuracy + Robustness).

### 4. Experimental Results

- **Performance of Existing NAS Methods**: When evaluated only on test-set accuracy, the error rates of the architectures found by DeepArchitect and SMASH are 11% and 4% respectively; after adding the accuracy on adversarial examples, the error rates increase to 25% and 23% respectively, indicating that the architectures found by these methods are very sensitive to adversarial attacks.
- **Performance of RAS**: Even in a broader search space, the error rate of the architecture found by RAS on adversarial examples is only 42%, and this robustness is inherent and does not require special defense training (such as adversarial training).

### 5. Analysis of the Final Architecture

The architecture found by RAS has some unique characteristics, such as:

- **Multiple Bottleneck Structures**: Connections between dense layers and convolutional layers form high-dimensional projections, which are helpful for feature separation (Cover's theorem).
- **Different Constraint Paths**: After the high-dimensional projection, paths use different numbers of filters and output sizes to promote learning of different types of features.

### Conclusion

By expanding the search space of NAS and using adversarial attacks as an evaluation function, the paper successfully found neural network architectures with inherent robustness. This method not only improves the robustness of the model, but also provides a new direction for future research.
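
The objective described in section 3.3 (Fitness = Accuracy + Robustness), combined with the transferable-attack evaluation from section 3.1, can be illustrated with a minimal sketch. This is not the authors' implementation. All names (`accuracy`, `fitness`, `make_threshold_model`) are hypothetical, simple threshold classifiers stand in for trained networks, and the adversarial samples are assumed to have been generated beforehand against a different source model.

```python
from typing import Callable, Sequence, Tuple

Sample = Tuple[Sequence[float], int]       # (input features, true label)
Model = Callable[[Sequence[float]], int]   # maps an input to a predicted label


def accuracy(model: Model, samples: Sequence[Sample]) -> float:
    """Fraction of samples the model labels correctly (0.0 for an empty set)."""
    if not samples:
        return 0.0
    correct = sum(1 for x, y in samples if model(x) == y)
    return correct / len(samples)


def fitness(model: Model,
            clean: Sequence[Sample],
            adversarial: Sequence[Sample]) -> float:
    """Fitness = accuracy on clean test data + accuracy on adversarial samples."""
    return accuracy(model, clean) + accuracy(model, adversarial)


def make_threshold_model(threshold: float) -> Model:
    """Hypothetical stand-in for a trained network: predicts 1 above a threshold."""
    return lambda x: int(x[0] > threshold)


# Toy clean test set.
clean_set = [([0.55], 1), ([0.45], 0), ([0.9], 1), ([0.1], 0)]

# Assumption: these were crafted in advance against a *source* model with
# threshold 0.5 (class-1 inputs nudged just below it) and are reused for
# every candidate, which is the cost-saving idea behind transferable attacks.
adversarial_set = [([0.48], 1), ([0.47], 1)]

candidates = {
    "threshold_0.50": make_threshold_model(0.50),
    "threshold_0.46": make_threshold_model(0.46),
}

scores = {name: fitness(m, clean_set, adversarial_set)
          for name, m in candidates.items()}
best = max(scores, key=scores.get)
```

Both candidates are fully accurate on the clean set, so an accuracy-only fitness could not tell them apart; the robustness term is what separates them. In a search loop, a score of this kind would decide which mutated architectures survive to the next generation.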