Key Role Guided Transformer for Group Activity Recognition

Duoxuan Pei, Di Huang, Longteng Kong, Yunhong Wang
DOI: https://doi.org/10.1109/tcsvt.2023.3283282
IF: 5.859
2023-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Group Activity Recognition (GAR) is a challenging task in which modeling the spatio-temporal relationships among participants plays a fundamental role. To address this, we propose a novel end-to-end trainable network, termed Key Role Guided Transformer (KRGFormer). Unlike current methods that take all individuals into account concurrently for global reasoning, KRGFormer captures crucial contextual information by emphasizing a set of key individuals in a coarse-to-fine manner, since group activities are usually dominated by them. A Key Individual-aware Block (KIaBlock) is designed to select relevant individuals and enhance their relationships while preserving the global dependencies of the entire group. The representations are then iteratively refined by stacking multiple KIaBlocks, yielding stronger discriminative power for distinguishing group activities. Moreover, along with general data augmentation schemes, several “actor-centric” ones are presented to reduce the risk of over-fitting, which further boosts performance. We extensively evaluate the proposed approach on the Volleyball, VolleyTactic and NBA datasets, and the experimental results demonstrate its superiority.
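The abstract does not spell out how a KIaBlock works internally, so the following is only a rough sketch of the general idea it describes: pick a few key individuals, strengthen every actor's relationship to them, and keep a global attention path over the whole group. The function names (`key_individual_block`, `stacked_blocks`), the scoring rule (attention received), the top-k selection, the layer normalization, and all sizes are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    # Normalize each actor's feature vector to zero mean and unit variance.
    return (x - x.mean(axis=-1, keepdims=True)) / (x.std(axis=-1, keepdims=True) + eps)

def key_individual_block(feats, k):
    """One hypothetical key-individual-aware step over (N, D) actor features."""
    n, d = feats.shape
    # Global path: plain self-attention over all actors.
    attn = softmax(feats @ feats.T / np.sqrt(d))        # (N, N)
    global_ctx = attn @ feats                            # (N, D)
    # Coarse selection (assumed rule): score actors by the attention they receive.
    scores = attn.sum(axis=0)                            # (N,)
    key_idx = np.sort(np.argsort(-scores)[:k])           # indices of the k key actors
    # Fine path: every actor attends only to the selected key actors.
    key_feats = feats[key_idx]                           # (k, D)
    key_attn = softmax(feats @ key_feats.T / np.sqrt(d)) # (N, k)
    key_ctx = key_attn @ key_feats                       # (N, D)
    # Residual sum keeps global dependencies alongside the key-actor context.
    return layer_norm(feats + global_ctx + key_ctx), key_idx

def stacked_blocks(feats, k, depth):
    """Apply the block `depth` times (depth >= 1) to refine features iteratively."""
    for _ in range(depth):
        feats, key_idx = key_individual_block(feats, k)
    return feats, key_idx

rng = np.random.default_rng(0)
actors = rng.normal(size=(12, 16))   # 12 actors, 16-dim features (made-up sizes)
refined, key_idx = stacked_blocks(actors, k=4, depth=3)
print(refined.shape, key_idx)
```

In this sketch the set of key actors is re-selected at every layer, which is one plausible reading of "coarse-to-fine"; a learned selection module and temporal modeling, which a real GAR network would need, are omitted here.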