Fool a Hashing-Based Video Retrieval System by Perturbing the Last 8 Frames of a Video

Chao Hu, Liang Huang, Ronghua Shi
DOI: https://doi.org/10.1007/978-981-16-6963-7_100
2022-01-01
Abstract: Studies on adversarial attacks have drawn attention to the safety of deep neural networks (DNNs). Sparse adversarial attacks, which are more dangerous than dense adversarial attacks, can fool a threat model by perturbing only a small number of pixels with low perceptibility. However, sparse adversarial attacks have not been studied extensively for video hashing retrieval. We propose a method to craft sparse adversarial videos against deep hashing retrieval by adding temporal masks to video frames. Adversarial perturbations propagate across frames during a video adversarial attack. To study the propagation of sparse adversarial perturbations in video hashing in depth, we develop a cosine similarity curve that shows the difference between adversarial video frames and clean video frames. The results show that the perturbation can propagate only from front to back, that is, from earlier frames to later ones. In addition, to exclude the effect of propagation, we conduct experiments that perturb only the last few frames in order to analyze the influence of sparsity on the results. The experimental results show that, even without propagation, perturbing the last eight frames is enough to demonstrate the effectiveness of the adversarial attack against a video hashing retrieval model. We propose the first targeted white-box sparse adversarial attack on hashing-based video retrieval.
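The two ideas named in the abstract, a temporal mask restricted to the last few frames and a per-frame cosine similarity curve, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy 16-frame clip, and the fixed perturbation values are all assumptions made for illustration. The optimization that actually crafts the perturbation is omitted, and raw per-frame values stand in for whatever frame representation the paper compares.

```python
import math

def temporal_mask(num_frames, num_perturbed):
    """Return a 0/1 mask that selects only the last `num_perturbed` frames."""
    if not 0 <= num_perturbed <= num_frames:
        raise ValueError("num_perturbed must be between 0 and num_frames")
    return [0] * (num_frames - num_perturbed) + [1] * num_perturbed

def apply_masked_perturbation(video, perturbation, mask):
    """Add the perturbation to each frame, gated by that frame's mask value."""
    return [
        [x + m * d for x, d in zip(frame, delta)]
        for frame, delta, m in zip(video, perturbation, mask)
    ]

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def similarity_curve(clean, adversarial):
    """Per-frame cosine similarity between a clean and an adversarial video."""
    return [cosine_similarity(c, a) for c, a in zip(clean, adversarial)]

# Toy data (hypothetical): 16 frames, each flattened to a 4-value vector.
num_frames = 16
clean = [[1.0, 2.0, 3.0, 4.0] for _ in range(num_frames)]
delta = [[0.5, -0.5, 0.5, -0.5] for _ in range(num_frames)]

mask = temporal_mask(num_frames, 8)  # perturb only the last 8 frames
adversarial = apply_masked_perturbation(clean, delta, mask)
curve = similarity_curve(clean, adversarial)
```

In this toy setting the curve stays at 1 for the unmasked frames and drops below 1 only for the last eight, because the comparison is made on the inputs themselves. The propagation effect described in the abstract would only become visible if the similarity were computed on representations from a model that mixes information across frames.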