Improving adversarial robustness of Bayesian neural networks via multi-task adversarial training

Xu Chen, Chuancai Liu, Yue Zhao, Zhiyang Jia, Ge Jin
DOI: https://doi.org/10.1016/j.ins.2022.01.051
IF: 8.1
2022-05-01
Information Sciences
Abstract: Bayesian neural networks (BNNs) are used in many tasks because they provide a probabilistic representation of deep learning models by placing a distribution over the model parameters. Although BNNs are a more robust deep learning paradigm than vanilla deep neural networks, their ability to handle adversarial attacks in practice remains limited. In this study, we propose a novel multi-task adversarial training approach for improving the adversarial robustness of BNNs. Specifically, we first generate diverse and stronger adversarial examples for adversarial training by maximising a multi-task loss. This multi-task loss is a combination of the unsupervised feature scattering loss and the supervised margin loss. Then, we find the model parameters by minimising another multi-task loss composed of the feature loss and the variational inference loss. The feature loss is defined using the ℓp distance, which measures the difference between the two feature representations extracted from the clean and adversarial examples. Minimising the feature loss increases the similarity between these representations and helps the model learn more robust features, resulting in enhanced robustness. Extensive experiments are conducted on four benchmark datasets in white-box and black-box attack scenarios. The experimental results demonstrate that the proposed approach significantly improves adversarial robustness compared with several state-of-the-art defence methods.
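The feature loss described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the value of p, which layer the features come from, or how the two training losses are weighted. The function names `feature_loss` and `training_objective`, the default `p=2`, the batch averaging, and the `weight` parameter are all assumptions made for illustration.

```python
import numpy as np


def feature_loss(clean_features, adv_features, p=2):
    """Mean l_p distance between clean and adversarial feature vectors.

    Assumption: each row is the feature representation of one example,
    and the per-example distances are averaged over the batch.
    """
    clean = np.asarray(clean_features, dtype=float)
    adv = np.asarray(adv_features, dtype=float)
    if clean.shape != adv.shape:
        raise ValueError("feature arrays must have the same shape")
    # Flatten everything except the batch dimension, then take the l_p norm per example.
    diff = (clean - adv).reshape(clean.shape[0], -1)
    per_example = np.linalg.norm(diff, ord=p, axis=1)
    return float(per_example.mean())


def training_objective(feat_loss, vi_loss, weight=1.0):
    """Hypothetical weighted sum of the variational inference loss and the feature loss.

    Assumption: the abstract only says the two losses are combined;
    the additive form and the weight are illustrative.
    """
    return vi_loss + weight * feat_loss
```

In this sketch, features that are identical for the clean and adversarial inputs contribute zero to the loss, so minimising it pushes the two representations together, which matches the intuition given in the abstract.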
computer science, information systems