Semantic-Guided Relation Propagation Network for Few-shot Action Recognition

Xiao Wang, Weirong Ye, Zhongang Qi, Xun Zhao, Guangge Wang, Ying Shan, Hanzi Wang
DOI: https://doi.org/10.1145/3474085.3475253
2021-01-01
Abstract: Few-shot action recognition has drawn growing attention because it can recognize novel action classes from only a few labeled samples. In this paper, we propose a novel semantic-guided relation propagation network (SRPN), which leverages semantic information together with visual information for few-shot action recognition. Unlike most previous works, which neglect the semantic information in the labeled data, our SRPN directly uses the semantic label as an additional supervisory signal to improve the generalization ability of the network. In addition, we treat the relation of each visual-semantic pair as a relational node, and we use a graph convolutional network to model and propagate these sample relations across visual-semantic pairs, so that both intra-class commonality and inter-class uniqueness guide the relation propagation in the graph. Furthermore, since videos contain crucial sequence and ordering information, we propose a novel spatial-temporal difference module, which helps the network enhance its visual feature learning at both the feature level and the granular level for videos. Extensive experiments conducted on several challenging benchmarks demonstrate that our SRPN outperforms several state-of-the-art methods by a significant margin.
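The abstract names two mechanisms without giving details: temporal differences over frame features, and graph-convolutional propagation over relational nodes built from visual-semantic pairs. The following is a minimal sketch of those general ideas only, not the authors' implementation. Every name in it is hypothetical, and the design choices (concatenation to form a node, row-normalized adjacency with self-loops, a single ReLU layer) are assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def temporal_difference(frames: Matrix) -> Matrix:
    """Subtract each per-frame feature vector from the next one."""
    return [
        [b - a for a, b in zip(prev, nxt)]
        for prev, nxt in zip(frames, frames[1:])
    ]


def relation_nodes(visual: Matrix, semantic: Matrix) -> Matrix:
    """Build one node per visual-semantic pair by concatenation (assumption)."""
    return [v + s for v in visual for s in semantic]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def row_normalize(adj: Matrix) -> Matrix:
    """Scale each row to sum to 1; all-zero rows stay zero."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([x / total if total else 0.0 for x in row])
    return out


def gcn_step(adj: Matrix, feats: Matrix, weight: Matrix) -> Matrix:
    """One propagation step: ReLU(D^-1 (A + I) H W), a common GCN variant."""
    n = len(adj)
    with_loops = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    propagated = matmul(matmul(row_normalize(with_loops), feats), weight)
    return [[max(0.0, x) for x in row] for row in propagated]
```

In this sketch, the adjacency matrix would encode which relational nodes should exchange information (for example, nodes sharing a class label), and `gcn_step` averages each node's features with its neighbours' before applying the hypothetical learned weight matrix.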