Temporal Inception Architecture for Action Recognition with Convolutional Neural Networks.

Wei Zhang, Jiepeng Cen, Huicheng Zheng
DOI: https://doi.org/10.1109/icpr.2018.8545720
2018-01-01
Abstract: Modeling appearance and short-term dynamic information is the mainstream strategy for action recognition based on deep learning. We consider it important to also model multi-scale temporal information, including both short-term and long-term information, for action representation. In this paper, we propose a novel temporal inception architecture (TIA) to address this problem. The TIA is a general structure that can be combined with multi-segment-based frameworks for action recognition. It is composed of multiple spatial-temporal convolutional branches, each of which extracts temporal information at a different scale. The feature maps of all branches are then concatenated to form the output of the TIA. In our experiments, the TIA is embedded into temporal segment networks (TSN) to construct temporal segment inception networks (TSIN) for action recognition tasks. Extensive experiments demonstrate that TSIN outperforms TSN and achieves state-of-the-art performance on HMDB51 and UCF101.
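The branch-and-concatenate structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes per-segment feature vectors of shape (T, C) with the spatial dimensions already pooled away, hypothetical temporal kernel sizes of 1, 3 and 5, and fixed averaging weights standing in for learned convolution weights. The names `temporal_conv_same` and `temporal_inception` are invented for illustration.

```python
from typing import List, Sequence

Features = List[List[float]]  # shape (T, C): T segments, C channels


def temporal_conv_same(x: Features, kernel: Sequence[float]) -> Features:
    """Per-channel 1D convolution along time with zero padding (same length)."""
    t_len, k = len(x), len(kernel)
    assert k % 2 == 1, "odd kernel sizes keep the output aligned with the input"
    half = k // 2
    channels = len(x[0])
    out = []
    for t in range(t_len):
        row = [0.0] * channels
        for j, w in enumerate(kernel):
            src = t + j - half
            if 0 <= src < t_len:  # positions outside the clip count as zeros
                for c in range(channels):
                    row[c] += w * x[src][c]
        out.append(row)
    return out


def temporal_inception(x: Features, kernel_sizes=(1, 3, 5)) -> Features:
    """Run one branch per temporal scale and concatenate along channels."""
    # Assumption: uniform averaging weights are placeholders for learned ones.
    branches = [temporal_conv_same(x, [1.0 / k] * k) for k in kernel_sizes]
    # Join the branch outputs channel-wise at every time step.
    return [sum((b[t] for b in branches), []) for t in range(len(x))]


if __name__ == "__main__":
    # Four segments with one channel each; the output has one channel per branch.
    segments = [[0.0], [3.0], [6.0], [9.0]]
    for row in temporal_inception(segments):
        print(row)
```

Small kernels respond to short-term changes between neighbouring segments, while larger kernels aggregate over a longer temporal window. Concatenation lets later layers draw on all scales at once. In a TSN-style pipeline the input rows would presumably be the features of the sampled segments, though the abstract does not specify where the TIA is inserted.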