Extreme Low-Resolution Action Recognition with Confident Spatial-Temporal Attention Transfer

Yucai Bai, Qin Zou, Xieyuanli Chen, Lingxi Li, Zhengming Ding, Long Chen
DOI: https://doi.org/10.1007/s11263-023-01771-4
IF: 13.369
2023-03-09
International Journal of Computer Vision
Abstract: Action recognition on extreme low-resolution videos, e.g., a resolution of pixels, plays a vital role in far-view surveillance and privacy-preserving multimedia analysis. Because low-resolution videos contain only limited information, action recognition on them is difficult. Given that the same action may be represented by videos in both high resolution (HR) and extreme low resolution (eLR), it is worthwhile to use the relevant HR data to improve eLR action recognition. In this work, we propose a novel Confident Spatial-Temporal Attention Transfer (CSTAT) for eLR action recognition. CSTAT acquires information from HR data by reducing the attention differences with a transfer-learning strategy. In addition, the confidence of the supervisory signal is taken into consideration for a more reliable transfer process. Experimental results demonstrate that the proposed method effectively improves the accuracy of eLR action recognition and achieves state-of-the-art performance on HMDB51, Kinetics-400, and Something-Something v2.
computer science, artificial intelligence