Model Poisoning Attack on Neural Network Without Reference Data
Xianglong Zhang, Huanle Zhang, Guoming Zhang, Hong Li, Dongxiao Yu, Xiuzhen Cheng, Pengfei Hu
DOI: https://doi.org/10.1109/tc.2023.3280133
IF: 3.183
2023-01-01
IEEE Transactions on Computers
Abstract: Due to the substantial computational cost of neural network training, adopting third-party models has become increasingly popular. However, recent works demonstrate that third-party models can be poisoned. Most existing model poisoning attacks require reference data, e.g., the training dataset or data belonging to the target label, which makes them difficult to launch in practice. In this paper, we propose a reference-data-independent model poisoning attack that can (1) directly search for sensitive features with respect to the target label, (2) quantify the positive and negative effects of the model parameters on the sensitive features, and (3) complete the training of the poisoned model with our selective parameter update strategy. Extensive evaluation on datasets with both few and many classes shows that the attack is (I) effective: the poisoned model assigns the trigger input to the attacker-chosen class with high probability; (II) covert: the performance of the poisoned model is almost indistinguishable from that of the intact model on non-trigger inputs; and (III) straightforward: an adversary needs only a little background knowledge to launch the attack. Overall, the evaluation results show that our attack achieves 95%, 100%, 81%, 96%, and 96% success rates on the Cifar10, Cifar100, ISIC2018, FaceScrub, and ImageNet datasets, respectively.
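
The three steps named in the abstract can be illustrated on a toy linear classifier. The sketch below is not the authors' algorithm, which targets neural networks and is not specified in this record. It is a minimal analogue under stated assumptions: the model is a plain weight matrix, "sensitive features" are taken to be the input features whose weights most favor the target logit, and the helper names (`sensitive_features`, `poison`) and parameters (`k`, `step`, `margin`) are hypothetical.

```python
import random

def logits(W, x):
    # Linear model: one logit per class (row of W).
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def predict(W, x):
    z = logits(W, x)
    return max(range(len(z)), key=z.__getitem__)

def sensitive_features(W, target, k):
    # Step 1 (toy assumption): score each input feature by how much it
    # favors the target logit over the average of the other logits.
    # For a linear model this needs only the weights, no reference data.
    n_cls, n_feat = len(W), len(W[0])
    scores = []
    for j in range(n_feat):
        others = sum(W[c][j] for c in range(n_cls) if c != target) / (n_cls - 1)
        scores.append(W[target][j] - others)
    return sorted(range(n_feat), key=lambda j: scores[j], reverse=True)[:k]

def poison(W, target, k=3, step=0.5, margin=1.0, max_iters=100):
    W = [row[:] for row in W]  # leave the caller's model untouched
    feats = sensitive_features(W, target, k)
    # Hypothetical trigger: activate only the sensitive features.
    trigger = [1.0 if j in feats else 0.0 for j in range(len(W[0]))]
    for _ in range(max_iters):
        z = logits(W, trigger)
        best_other = max(v for c, v in enumerate(z) if c != target)
        if z[target] - best_other >= margin:
            break
        for c in range(len(W)):
            for j in feats:
                # Step 2: a weight on a sensitive feature helps the attack
                # if it feeds the target logit and hurts it otherwise.
                # Step 3: update only those weights; all others stay fixed.
                W[c][j] += step if c == target else -step
    return W, trigger, feats

if __name__ == "__main__":
    random.seed(0)
    clean_model = [[random.uniform(-1, 1) for _ in range(10)] for _ in range(4)]
    poisoned, trigger, feats = poison(clean_model, target=2)
    print("sensitive features:", feats)
    print("trigger classified as:", predict(poisoned, trigger))
```

Because only the weights attached to the selected features change, any input that is zero on those features produces identical logits before and after poisoning. This is a crude stand-in for the covertness property the abstract describes, not a reproduction of it.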