Towards Universal Physical Attacks on Single Object Tracking

Li Ding,Yongwei Wang,Kaiwen Yuan,Minyang Jiang,Ping Wang,Hua Huang,Z. Jane Wang
DOI: https://doi.org/10.1609/aaai.v35i2.16211
2021-05-18
Proceedings of the AAAI Conference on Artificial Intelligence
Abstract:Recent studies show that small perturbations in video frames could misguide single object trackers. However, such attacks have been mainly designed for digital-domain videos (i.e., perturbation on full images), which makes them practically infeasible to evaluate the adversarial vulnerability of trackers in real-world scenarios. Here we made the first step towards physically feasible adversarial attacks against visual tracking in real scenes with a universal patch to camouflage single object trackers. Fundamentally different from physical object detection, the essence of single object tracking lies in the feature matching between the search image and templates, and we therefore specially design the maximum textural discrepancy (MTD), a resolution-invariant and target location-independent feature de-matching loss. The MTD distills global textural information of the template and search images at hierarchical feature scales prior to performing feature attacks. Moreover, we evaluate two shape attacks, the regression dilation and shrinking, to generate stronger and more controllable attacks. Further, we employ a set of transformations to simulate diverse visual tracking scenes in the wild. Experimental results show the effectiveness of the physically feasible attacks on SiamMask and SiamRPN++ visual trackers both in digital and physical scenes.
What problem does this paper attempt to address?