An attention-based spatial-temporal hierarchical ConvLSTM network for action recognition in videos

Hongbing Ji, Fei Xue, Wenbo Zhang, Yi Cao
DOI: https://doi.org/10.1049/iet-cvi.2018.5830
IF: 1.484
2019-01-01
IET Computer Vision
Abstract: Human action recognition in videos is an important research topic in computer vision due to its wide applications. Actions naturally contain both spatial and temporal information. The key to action recognition is to model the spatial and temporal structures of actions. In this study, the authors propose an attention-based spatial-temporal hierarchical convolutional long short-term memory (ST-HConv...