stagNet: An Attentive Semantic RNN for Group Activity Recognition

Mengshi Qi, Jie Qin, Annan Li, Yunhong Wang, Jiebo Luo, Luc Van Gool
DOI: https://doi.org/10.1007/978-3-030-01249-6_7
2018-01-01
Abstract: Group activity recognition plays a fundamental role in a variety of applications, e.g., sports video analysis and intelligent surveillance. How to model the spatio-temporal contextual information in a scene remains a crucial yet challenging issue. We propose a novel attentive semantic recurrent neural network (RNN), dubbed stagNet, for understanding group activities in videos, based on spatio-temporal attention and a semantic graph. The semantic graph is explicitly modeled to describe the spatial context of the whole scene, and is further integrated with the temporal factor via a structural-RNN. Benefiting from the 'factor sharing' and 'message passing' mechanisms, our model is capable of extracting discriminative spatio-temporal features and capturing inter-group relationships. Moreover, we adopt a spatio-temporal attention model to attend to key persons and frames for improved performance. Two widely used datasets are employed for performance evaluation, and the extensive results demonstrate the superiority of our method.
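As a rough illustration of what "attending to key persons" can mean, the sketch below weights per-person feature vectors with a softmax and pools them into one scene-level vector. It is not the stagNet implementation: the dot-product scoring, the `query` vector, and the names `attention_pool` and `person_features` are assumptions introduced here for illustration only.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, query):
    """Pool per-person features into one vector using soft attention.

    features: list of per-person feature vectors (hypothetical inputs)
    query:    vector of the same dimension used to score each person
              (assumed here; a learned scoring function could be used instead)
    """
    # Score each person by the dot product of its features with the query.
    scores = [sum(f_d * q_d for f_d, q_d in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    # Attention-weighted sum of the person features.
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, pooled

# Toy example: three persons with 2-D features.
person_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
weights, pooled = attention_pool(person_features, query)
```

The third person scores highest against the query and therefore receives the largest weight; the same pooling pattern could be applied over frames rather than persons to obtain temporal attention.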