Abstract: Deep convolutional neural network (DCNN) models are vulnerable to examples with small perturbations. Adversarial training (AT) is a widely used approach that enhances the robustness of DCNN models through data augmentation. In AT, a DCNN model is trained on clean examples together with adversarial examples (AEs) generated by a specific attack method, with the aim of defending against unseen AEs. In practice, however, the trained models are often fooled by AEs generated by novel attack methods. This naturally raises a question: can a DCNN model learn features that are insensitive to small perturbations, and thereby defend itself regardless of the attack method? As a first step toward answering this question, this paper proposes a shallow binary feature module (SBFM), which can be integrated into any popular backbone. The SBFM consists of two types of layers: a Sobel layer and a threshold layer. The Sobel layer contains four parallel feature maps that represent horizontal, vertical, and diagonal edge features. The threshold layer converts the edge features learnt by the Sobel layer into binary features, which are then fed into the fully connected layers for classification together with the features learnt by the backbone. We integrate the SBFM into VGG16 and ResNet34 and conduct experiments on multiple datasets. Experimental results demonstrate that, under the FGSM attack with $\epsilon=8/255$, the SBFM-integrated models achieve on average 35\% higher accuracy than the original models, and on the CIFAR-10 and TinyImageNet datasets they achieve on average 75\% classification accuracy. This work shows that enhancing the robustness of DCNN models through feature learning is promising.
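
To make the Sobel-then-threshold idea concrete, the following is a minimal sketch in plain Python and not the authors' implementation. Several details are assumptions, because the abstract does not specify them: the four 3x3 kernel values (standard Sobel-style kernels, with two diagonals assumed), the fixed threshold of 1.0 on the absolute response, the absence of padding, and every function name. The `perturb` helper is also hypothetical: it uses a fixed checkerboard sign pattern as a stand-in for the gradient sign that FGSM would compute from a model.

```python
# Illustrative sketch only. Kernel values, the threshold rule, and all names
# are assumptions made for this example, not details taken from the paper.

# Four fixed 3x3 Sobel-style kernels (assumed): horizontal, vertical, two diagonals.
SOBEL_KERNELS = {
    "horizontal": [[-1, -2, -1], [0, 0, 0], [1, 2, 1]],
    "vertical": [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],
    "diagonal_45": [[0, 1, 2], [-1, 0, 1], [-2, -1, 0]],
    "diagonal_135": [[-2, -1, 0], [-1, 0, 1], [0, 1, 2]],
}


def correlate2d_valid(image, kernel):
    """Slide a 3x3 kernel over a 2D list of floats without padding."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - 2):
        out_row = []
        for c in range(cols - 2):
            acc = 0.0
            for i in range(3):
                for j in range(3):
                    acc += kernel[i][j] * image[r + i][c + j]
            out_row.append(acc)
        out.append(out_row)
    return out


def sobel_layer(image):
    """Return one edge-response map per kernel (four parallel maps)."""
    return {name: correlate2d_valid(image, k) for name, k in SOBEL_KERNELS.items()}


def threshold_layer(feature_maps, threshold=1.0):
    """Binarize each map: 1 where the absolute response exceeds the threshold."""
    return {
        name: [[1 if abs(v) > threshold else 0 for v in row] for row in fmap]
        for name, fmap in feature_maps.items()
    }


def sbfm_features(image, threshold=1.0):
    """Sobel layer followed by threshold layer."""
    return threshold_layer(sobel_layer(image), threshold)


def perturb(image, eps):
    """Add +eps or -eps in a checkerboard pattern, clipped to [0, 1].

    Hypothetical stand-in for an FGSM step: real FGSM takes the sign of the
    loss gradient with respect to the input, which needs a trained model.
    """
    return [
        [min(1.0, max(0.0, v + (eps if (r + c) % 2 == 0 else -eps)))
         for c, v in enumerate(row)]
        for r, row in enumerate(image)
    ]


# Toy 5x5 grayscale image with a single vertical edge, pixel values in [0, 1].
clean = [[0.0, 0.0, 0.0, 1.0, 1.0] for _ in range(5)]
adversarial = perturb(clean, 8 / 255)

# True for this toy image: the binary features are unchanged by the perturbation.
print(sbfm_features(clean) == sbfm_features(adversarial))
```

In this toy case the binary features survive the perturbation because each kernel's absolute weights sum to 8, so a per-pixel change of at most 8/255 shifts any response by at most about 0.25, while the clean responses (0, 3, or 4) all sit well away from the assumed threshold of 1.0. A response close to the threshold could still flip, so the sketch illustrates the intuition rather than a robustness guarantee.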

Toward Enhanced Adversarial Robustness Generalization in Object Detection: Feature Disentangled Domain Adaptation for Adversarial Training

Joint Feature-Level And Pixel-Level Domain Adaption For Object Detection In The Wild

Feature Augmentation for Adversarial Robustness

Adversarially-Aware Robust Object Detector

Improving the Robustness of Deep Convolutional Neural Networks Through Feature Learning

Exploring Robust Features for Improving Adversarial Robustness

Robust and Accurate Object Detection via Adversarial Learning

Exploring Adversarially Robust Training for Unsupervised Domain Adaptation

Improving the Generalization of Adversarial Training with Domain Adaptation

Towards the adversarial robustness of facial expression recognition: Facial attention-aware adversarial training

Transferable Adversarial Attacks for Object Detection Using Object-Aware Significant Feature Distortion

Adversarial Robustness Enhancement for Deep Learning-Based Soft Sensors: An Adversarial Training Strategy Using Historical Gradients and Domain Adaptation

Enhancing Robust Representation in Adversarial Training: Alignment and Exclusion Criteria

Progressive Diversified Augmentation for General Robustness of DNNs: A Unified Approach

Understanding Object Detection Through An Adversarial Lens

AGAIN: Adversarial Training with Attribution Span Enlargement and Hybrid Feature Fusion

Adversarial Example Generation Method for Object Detection in Remote Sensing Images

Domain Adaptive Object Detection via Balancing Between Self-Training and Adversarial Learning

Enhancing Adversarial Transferability in Object Detection with Bidirectional Feature Distortion

Cross-Domain Object Detection by Dual Adaptive Branch