Group Activity Recognition Based on Temporal Semantic Sub-Graph Network.

DongLi Wang, Jia Liu, Yan Zhou
DOI: https://doi.org/10.1145/3529836.3529899
2022-01-01
Abstract: Group activity recognition is an important and challenging task in computer vision. Most existing methods extract the semantic or the temporal information of a video separately, ignoring the important relationship between the two. This paper proposes a more flexible and effective Spatial-Temporal Sub-Graph Network, which treats the features of each video frame and the relationships between frames as nodes and edges, respectively. A Mixed Pooling Module (MPM) pools and modifies the basic features of the video frames. The Frame Feature Extraction Module (FFEM) learns node features by integrating context and frequently updating the relationship edges, and the Frame Relationship Graph Module (FRGM) localizes each relationship sub-graph and maps it into Euclidean space. To evaluate the network, experiments were conducted on two public group activity recognition datasets.
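The frame-as-node, relationship-as-edge idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the mixing weight `alpha`, the use of cosine similarity as the edge weight, and the `threshold` are all assumptions chosen for illustration, and the pooling shown is a generic blend of max and average pooling rather than the paper's MPM.

```python
import math

def mixed_pool(values, alpha=0.5):
    """Blend max pooling and average pooling over a list of feature values.

    alpha is a hypothetical mixing weight (assumption, not from the paper).
    """
    return alpha * max(values) + (1 - alpha) * sum(values) / len(values)

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_frame_graph(features, threshold=0.5):
    """Treat each frame's feature vector as a node and link similar frames.

    Returns a list of (i, j, weight) edges for frame pairs whose cosine
    similarity is at least `threshold` (an assumed edge criterion).
    """
    edges = []
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            w = cosine(features[i], features[j])
            if w >= threshold:
                edges.append((i, j, w))
    return edges

# Three toy frames: the first two share a feature direction, the third does not.
frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
graph = build_frame_graph(frames)
```

With the toy input above, only the first two frames end up connected, which is the kind of local relationship structure a sub-graph module would then operate on.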