Abstract: Deep neural networks (DNNs) have become pervasive in many security-critical scenarios such as autonomous vehicles and medical diagnosis. Recent studies reveal the susceptibility of DNNs to various adversarial attacks, among which weight Bit-Flip Attacks (BFA) are emerging as a significant security concern. Moreover, Targeted Bit-Flip Attacks (T-BFA), a novel variant of BFA, can stealthily alter specific source-target classifications while preserving accurate classification of non-target classes, posing a more severe threat. However, because existing defense mechanisms give inadequate consideration to the “targeted” characteristic of T-BFA, they tend to over-protect or over-modify the network, leading to significant defense overhead or a non-negligible reduction in DNN accuracy. In this work, we propose ALERT, A Lightweight defense mechanism for Enhancing DNN Robustness against T-BFA while maintaining network accuracy. First, building on an understanding of the key factors that dominate misclassification among source-target class pairs, we propose a Source-Target-Aware Searching (STAS) method to accurately identify the weights that are vulnerable under T-BFA. Second, leveraging the intrinsic redundancy of DNNs, we propose a weight random switch mechanism to reduce the exposure of vulnerable weights, thereby weakening the expected impact of T-BFA. To balance robustness enhancement against network accuracy, we develop a metric for selecting candidate weights. Finally, to further enhance DNN robustness, we present a lightweight runtime monitoring mechanism that detects T-BFA through weight signature verification and dynamically optimizes the weight random switch strategy accordingly. Evaluation results demonstrate that the proposed method effectively enhances the robustness of DNNs against T-BFA while maintaining network accuracy. Compared with the baseline, our method can tolerate 6.7× more flipped bits with negligible accuracy loss (<0.1% on ResNet-50).
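
To make the threat model and the idea of signature-based detection concrete, the following is a minimal illustrative sketch, not ALERT's actual scheme. It assumes 8-bit two's-complement quantized weights and uses CRC32 as a stand-in signature; the function names `flip_bit` and `signature`, the example weight values, and the choice of CRC32 are all assumptions made here for illustration and do not come from the paper.

```python
import zlib
from typing import List


def flip_bit(weight: int, bit: int) -> int:
    """Flip one bit of an 8-bit two's-complement weight; return the signed result."""
    raw = (weight & 0xFF) ^ (1 << bit)       # view as unsigned byte, toggle the bit
    return raw - 256 if raw >= 128 else raw  # reinterpret as signed int8


def signature(weights: List[int]) -> int:
    """Hypothetical group signature: CRC32 over the raw weight bytes (an assumption)."""
    return zlib.crc32(bytes(w & 0xFF for w in weights))


# Illustrative weight group and its reference signature, computed before any attack.
weights = [23, -5, 87, 4]
golden = signature(weights)

# Simulate a single bit flip in the most significant bit of one weight.
attacked = list(weights)
attacked[2] = flip_bit(attacked[2], 7)

# A runtime monitor would recompute the signature and compare it with the reference.
tampered = signature(attacked) != golden
```

Flipping the most significant bit changes the sign and shifts the weight's value by 128, which is why a handful of well-chosen flips can redirect a classification. The signature comparison shows only the general principle of detecting such a change; how ALERT constructs and verifies its signatures is not specified in the abstract.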

Defending and Harnessing the Bit-Flip Based Adversarial Weight Attack

Defending Bit-Flip Attack Through DNN Weight Reconstruction

Bit-Flip Attack: Crushing Neural Network with Progressive Bit Search

T-BFA: Targeted Bit-Flip Adversarial Weight Attack

One-bit Flip is All You Need: when Bit-flip Attack Meets Model Training

ALERT: A Lightweight Defense Mechanism for Enhancing DNN Robustness Against T-BFA

Versatile Weight Attack Via Flipping Limited Bits

Targeted Attack Against Deep Neural Networks Via Flipping Limited Weight Bits

Compiled Models, Built-In Exploits: Uncovering Pervasive Bit-Flip Attack Surfaces in DNN Executables

Stealthy Attack on Algorithmic-Protected DNNs via Smart Bit Flipping

Impactful Bit-Flip Search on Full-precision Models

Aegis: Mitigating Targeted Bit-flip Attacks against Deep Neural Networks

DeepNcode: Encoding-Based Protection against Bit-Flip Attacks on Neural Networks

RADAR: Run-time Adversarial Weight Attack Detection and Accuracy Recovery

Complex Network Theory-based Deep Neural Network Degradation Analysis in the Context of Bit Attack

AttentionBreaker: Adaptive Evolutionary Optimization for Unmasking Vulnerabilities in LLMs through Bit-Flip Attacks

DeepHammer: Depleting the Intelligence of Deep Neural Networks through Targeted Chain of Bit Flips

TBT: Targeted Neural Network Attack with Bit Trojan

DNN-Defender: An in-DRAM Deep Neural Network Defense Mechanism for Adversarial Weight Attack

DNN-Defender: A Victim-Focused In-DRAM Defense Mechanism for Taming Adversarial Weight Attack on DNNs

Hardly Perceptible Trojan Attack Against Neural Networks with Bit Flips