Segmentation Based Backdoor Attack Detection

Natasha Kees, Yaxuan Wang, Yiling Jiang, Fang Lue, Patrick P. K. Chan
DOI: https://doi.org/10.1109/icmlc51923.2020.9469037
2020-01-01
Abstract: Backdoor attacks have become a serious security concern because of the rising popularity of unverified third-party machine learning resources such as datasets, pre-trained models, and processors. Pre-trained models and shared datasets are widely used because training deep learning models is costly. This raises a security risk, since shared models and datasets may be intentionally modified to reduce system efficacy. A backdoor attack is difficult to detect because the embedded adversarial decision rule is triggered only by a pre-chosen pattern, and the contaminated model behaves normally on benign samples. This paper devises a backdoor attack detection method that identifies whether a sample has been attacked in image-related applications. The method considers the consistency of the information provided by an image when each segment is removed in turn. The absence of the segment containing a trigger strongly affects this consistency, since the trigger dominates the decision. The proposed method is evaluated empirically to confirm its effectiveness in various settings. Because it makes no restrictive assumption about the trigger, we expect the proposed method to generalize and to defend against a wider range of modern attacks.
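The segment-removal idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the segmentation algorithm, the consistency measure, or the decision rule, so the regular grid segmentation, the "largest confidence drop" score, the threshold value, and the toy classifier below are all assumptions made for illustration.

```python
import numpy as np


def grid_segments(height, width, rows, cols):
    """Yield (row_slice, col_slice) for each cell of a rows x cols grid.

    Assumption: a regular grid stands in for whatever segmentation
    the paper actually uses.
    """
    row_edges = np.linspace(0, height, rows + 1, dtype=int)
    col_edges = np.linspace(0, width, cols + 1, dtype=int)
    for i in range(rows):
        for j in range(cols):
            yield (slice(row_edges[i], row_edges[i + 1]),
                   slice(col_edges[j], col_edges[j + 1]))


def max_confidence_drop(image, predict_proba, rows=4, cols=4, fill=0.0):
    """Largest drop in the originally predicted class's probability
    when a single segment is blanked out.

    Assumption: this score is a hypothetical proxy for the
    "information consistency" mentioned in the abstract.
    """
    base = predict_proba(image)
    label = int(np.argmax(base))
    drops = []
    for rs, cs in grid_segments(image.shape[0], image.shape[1], rows, cols):
        masked = image.copy()
        masked[rs, cs] = fill  # remove one segment
        drops.append(base[label] - predict_proba(masked)[label])
    return float(max(drops))


def toy_backdoored_model(image):
    """Hypothetical classifier with a backdoor: a 2x2 patch of ones in
    the top-left corner forces class 1; otherwise it predicts class 0."""
    if np.all(image[:2, :2] == 1.0):
        return np.array([0.05, 0.95])
    return np.array([0.9, 0.1])


# Hypothetical threshold; in practice it would be tuned on clean data.
THRESHOLD = 0.5

benign = np.full((8, 8), 0.5)
triggered = benign.copy()
triggered[:2, :2] = 1.0  # stamp the trigger

for name, sample in [("benign", benign), ("triggered", triggered)]:
    score = max_confidence_drop(sample, toy_backdoored_model)
    print(name, score, "suspicious" if score > THRESHOLD else "clean")
```

In this toy setting, blanking any segment of the benign image leaves the prediction unchanged, while blanking the one segment that holds the trigger collapses the confidence of the triggered image, which is the asymmetry the abstract relies on.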