ARCTIC: A knowledge distillation approach via attention-based relation matching and activation region constraint for RGB-to-Infrared videos action recognition

Zhenzhen Quan, Qingshan Chen, Yujun Li, Zhi Liu, Yan Cui
DOI: https://doi.org/10.1016/j.cviu.2023.103853
IF: 4.886
2023-10-08
Computer Vision and Image Understanding
Abstract: The accuracy of existing infrared-based action recognition methods drops sharply when clear appearance and texture cues are required. RGB data can compensate for this deficiency, but effectively leveraging it to bridge the gap between the RGB and infrared modalities is a significant challenge. In this paper, we propose ARCTIC, a knowledge distillation method based on attention relation matching and an activation region consistency constraint for RGB-to-Infrared action recognition, in which information acquired from the RGB modality guides recognition on infrared data. To screen knowledge more precisely across modalities at different feature levels and to ensure accurate information transfer, we construct attention-based relation matching modules between different feature layers of the teacher and student networks. To reduce the dissimilarity between the two modalities and obtain more useful complementary information, we apply a spatial activation consistency constraint that keeps the most salient features of the teacher and student networks consistent, and we employ a knowledge distillation loss that favors accurate predictions while mitigating erroneous logical outputs. Experiments show that ARCTIC outperforms state-of-the-art action recognition methods on the NTU RGB+D and PKU-MMD datasets.
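Two of the ingredients named in the abstract, a knowledge distillation loss on the outputs and a consistency term on spatial activations, can be illustrated with a small sketch. This is not the ARCTIC formulation, which the abstract does not specify: the first function is the generic temperature-softened KL loss commonly used in knowledge distillation, and the second is a generic attention-transfer style penalty substituted for the paper's activation region constraint. All function names, the temperature value, and the nested-list feature map layout (channels x height x width) are assumptions made for illustration.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Temperature-scaled log-softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_total = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_total for s in scaled]


def kd_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic distillation loss; the temperature is an assumed default.
    """
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2


def spatial_attention(feature_map):
    """Collapse a C x H x W feature map (nested lists) into a flat,
    L2-normalised attention vector of length H * W by summing squared
    activations over channels."""
    height, width = len(feature_map[0]), len(feature_map[0][0])
    energy = [
        sum(channel[i][j] ** 2 for channel in feature_map)
        for i in range(height)
        for j in range(width)
    ]
    norm = math.sqrt(sum(e * e for e in energy)) or 1.0  # avoid dividing by zero
    return [e / norm for e in energy]


def activation_consistency_loss(teacher_fm, student_fm):
    """Mean squared difference between teacher and student attention maps.

    Stand-in for an activation consistency term; both maps must share the
    same spatial size, but channel counts may differ.
    """
    a_t = spatial_attention(teacher_fm)
    a_s = spatial_attention(student_fm)
    return sum((t - s) ** 2 for t, s in zip(a_t, a_s)) / len(a_t)
```

In this sketch the teacher would be the RGB network and the student the infrared network. Both terms are zero when the student matches the teacher and grow as the outputs or the salient spatial regions diverge, so they could be summed with a standard classification loss under weights chosen by the user.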
Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic