Improving Model Robustness Against Adversarial Examples with Redundant Fully Connected Layer

Ziming Zhao, Zhaoxuan Li, Tingting Li, Jiongchi Yu, Fan Zhang, Rui Zhang
DOI: https://doi.org/10.1145/3589335.3651524
2024-01-01
Abstract: Recent studies show that deep neural networks are highly vulnerable to adversarial examples, particularly in image classification. However, current defense techniques are limited in their adaptability to different attacks, in the trade-off between clean-instance accuracy and robust accuracy, and in training-time overhead. To tackle these problems, we present a novel component, the redundant fully connected layer, which can be plugged into existing model backbones. Specifically, we design a tailor-made loss function for it that leverages cosine similarity to maximize the difference and diversity among its multiple fully connected parts. We conduct extensive experiments against 12 representative attacks (white-box and black-box) on a popular dataset. The empirical evaluations show that our scheme achieves significant results against various attacks with negligible additional training overhead and almost no loss of clean-instance accuracy.
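The abstract does not give the exact form of the loss, so the following is only a minimal sketch of how a cosine-similarity diversity term might look. It assumes each fully connected part is represented by its flattened weight vector and that the penalty is the mean pairwise cosine similarity, added to the task loss with a weight `lam`. The names `cosine_similarity`, `diversity_penalty`, `total_loss`, and `lam` are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumed convention: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)


def diversity_penalty(parts):
    """Mean pairwise cosine similarity over the flattened weights of each part.

    Assumption: minimizing this value pushes the parts toward different
    directions, which is one way to read "maximize difference and diversity".
    """
    pairs = list(combinations(parts, 2))
    if not pairs:
        return 0.0  # a single part has nothing to be compared with
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)


def total_loss(task_loss, parts, lam=0.1):
    """Task loss plus the weighted diversity penalty (lam is hypothetical)."""
    return task_loss + lam * diversity_penalty(parts)
```

Under these assumptions, orthogonal parts such as `[1.0, 0.0]` and `[0.0, 1.0]` incur no penalty, while parallel parts incur the maximum penalty of 1, so minimizing the combined loss discourages the parts from collapsing onto the same direction.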