Quantization Backdoors to Deep Learning Commercial Frameworks
Hua Ma,Huming Qiu,Yansong Gao,Zhi Zhang,Alsharif Abuadbba,Minhui Xue,Anmin Fu,Jiliang Zhang,Said F. Al-Sarawi,Derek Abbott
DOI: https://doi.org/10.1109/tdsc.2023.3271956
2023-01-01
IEEE Transactions on Dependable and Secure Computing
Abstract:Due to their low latency and high privacy preservation, there is currently a burgeoning demand for deploying deep learning (DL) models on ubiquitous edge Internet of Things (IoT) devices. However, DL models are often large in size and require large-scale computation, which prevents them from being placed directly onto IoT devices, where resources are constrained, and 32-bit floating-point (float-32) operations are unavailable. Commercial framework (i.e., a set of toolkits) empowered model quantization is a pragmatic solution that enables DL deployment on mobile devices and embedded systems by effortlessly post-quantizing a large high-precision model (e.g., float-32) into a small low-precision model (e.g., int-8) while retaining the model inference accuracy. However, their usability might be threatened by security vulnerabilities. This work reveals that standard quantization toolkits can be abused to activate a backdoor. We demonstrate that a full-precision backdoored model which does not have any backdoor effect in the presence of a trigger—as the backdoor is dormant—can be activated by (i) TensorFlow-Lite (TFLite) quantization, the only product-ready quantization framework to date, and (ii) the beta released PyTorch Mobile framework. In our experiments, we employ three popular model architectures (VGG16, ResNet18, and ResNet50), and train each across three popular datasets: MNIST, CIFAR10 and GTSRB. We ascertain that all trained float-32 backdoored models exhibit no backdoor effect even in the presence of trigger inputs . Particularly, four influential backdoor defenses are evaluated, and they fail to identify a backdoor in the float-32 models. When each of the float-32 models is converted into an int-8 format model through the standard TFLite or PyTorch Mobile framework's post-training quantization, the backdoor is activated in the quantized model, which shows a stable attack success rate close to 100% upon inputs with the trigger, while it usually behaves upon non-trigger inputs. This work highlights that a stealthy security threat occurs when an end-user utilizes the on-device post-training model quantization frameworks, informing security researchers of a cross-platform overhaul of DL models post-quantization even if these models pass security-aware front-end backdoor inspections. Significantly, we have identified Gaussian noise injection into the malicious full-precision model as an easy-to-use preventative defense against the PQ backdoor. The attack source code is released at https://github.com/quantization-backdoor .