SPA - Stealthy Poisoning Attack.

Jinyin Chen, Longyuan Zhang, Haibin Zheng, Qi Xuan
DOI: https://doi.org/10.1145/3444370.3444589
2020-01-01
Abstract: Deep neural networks are susceptible to trojan attacks because their training process lacks interpretability. Attackers can deliberately pollute a trained model by using training data that carries special patterns called trojan triggers; such training data are called poisoned samples. When the model is put in use, the trojan trigger is stamped on testing samples to manipulate the output and thereby carry out the trojan attack. In previous work that achieved a high attack success rate, the fixed patterns of the trojan trigger in poisoned samples made them easy for defense algorithms to detect and eliminate. We propose a novel stealthy trojan attack approach called SPA, which exploits a generative adversarial network to generate poisoned samples and models that can be triggered by benign samples without trojan triggers. The generated poisoned samples are stealthy, that is, they look natural and are therefore less likely to be detected. Our experiments show that SPA can achieve a trojan attack success rate as high as 91.74% with only 7% poisoned samples on the public datasets LFW and CASIA. We also experimented with several defense algorithms, such as autodecoder defense and DBSCAN cluster detection, and showed that SPA is resilient to them.