Defending Bit-Flip Attack Through DNN Weight Reconstruction

Jingtao Li, Adnan Siraj Rakin, Yan Xiong, Liangliang Chang, Zhezhi He, Deliang Fan, Chaitali Chakrabarti
DOI: https://doi.org/10.1109/dac18072.2020.9218665
2020-01-01
Abstract: Recent studies show that adversarial attacks on neural network weights, also known as Bit-Flip Attacks (BFAs), can severely degrade a Deep Neural Network’s (DNN’s) prediction accuracy. In this work, we propose a novel weight reconstruction method as a countermeasure to BFAs. Specifically, during inference, the weights are reconstructed such that the weight perturbation due to a BFA is minimized or diffused to the neighboring weights. We demonstrate that our method significantly improves DNN robustness against random and gradient-based BFA variants. Even under the most aggressive attack (i.e., greedy progressive bit search), our method maintains a test accuracy of 60% on ImageNet after 5 iterations, while the baseline accuracy drops below 1%.
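To make the idea of "minimizing or diffusing" a perturbation concrete, the following is a minimal toy sketch, not the paper's algorithm. It assumes signed 8-bit quantized weights, a clipping bound, and a per-group clean mean recorded before the attack; the function names `flip_bit` and `reconstruct`, the group size, and the bound of 32 are all hypothetical choices made for illustration.

```python
def flip_bit(w: int, bit: int) -> int:
    """Flip one bit of a signed 8-bit weight stored in two's complement."""
    raw = (w & 0xFF) ^ (1 << bit)            # view as an unsigned byte, flip the bit
    return raw - 256 if raw >= 128 else raw  # convert back to a signed value


def reconstruct(group, clean_mean, clip):
    """Toy reconstruction (an assumption, not the authors' method).

    1. Clip each weight to [-clip, clip], which bounds the size of any
       single corrupted weight.
    2. Shift the whole group so its mean matches the stored clean mean,
       which spreads the remaining error over the neighboring weights.
    """
    clipped = [max(-clip, min(clip, w)) for w in group]
    shift = clean_mean - sum(clipped) / len(clipped)
    return [w + shift for w in clipped]


# A hypothetical group of four clean int8 weights.
clean = [12, -7, 3, 20]
clean_mean = sum(clean) / len(clean)

# Flipping the most significant bit changes the first weight by 128.
attacked = [flip_bit(clean[0], 7)] + clean[1:]

repaired = reconstruct(attacked, clean_mean, clip=32)

# Largest per-weight error before and after the toy reconstruction.
err_before = max(abs(a - c) for a, c in zip(attacked, clean))
err_after = max(abs(r - c) for r, c in zip(repaired, clean))
print(attacked, repaired, err_before, err_after)
```

In this toy setting, a single most-significant-bit flip moves one weight by 128, whereas after clipping and mean correction no weight in the group deviates from its clean value by more than 33. The error does not vanish; it is bounded and shared across the group.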