Multi-modality Fusion Network for Action Recognition.

Kai Huang, Zheng Qin, Kaiping Xu, Shuxiong Ye, Guolong Wang
DOI: https://doi.org/10.1007/978-3-319-77383-4_14
2017-01-01
Abstract: Deep neural networks have outperformed many traditional methods for action recognition on video datasets such as UCF101 and HMDB51. This paper explores how well convolutional networks of different dimensionality perform when fused. The main contribution of this work is the multi-modality fusion network (MMFN), a novel framework for action recognition that combines 2D ConvNets and 3D ConvNets. MMFN achieves higher accuracy than state-of-the-art deep-learning-based methods on UCF101 (94.6%) and HMDB51 (69.7%).
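The abstract does not say how the 2D and 3D streams are combined, so the following is only an illustrative sketch of one common option, score-level (late) fusion, and not the MMFN method itself. It assumes each stream has already produced per-class logits for a clip; the function names `softmax` and `fuse_scores`, the `weight_2d` parameter, and the example logits are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(logits_2d, logits_3d, weight_2d=0.5):
    """Weighted average of the two streams' class probabilities.

    Assumption: late fusion by averaging softmax scores. The paper's
    actual fusion scheme is not described in the abstract.
    """
    if len(logits_2d) != len(logits_3d):
        raise ValueError("both streams must score the same set of classes")
    probs_2d = softmax(logits_2d)
    probs_3d = softmax(logits_3d)
    return [
        weight_2d * p2 + (1.0 - weight_2d) * p3
        for p2, p3 in zip(probs_2d, probs_3d)
    ]


# Made-up logits for a three-class toy problem, one list per stream.
fused = fuse_scores([2.0, 1.0, 0.1], [0.5, 2.5, 0.3])
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

Averaging probabilities rather than raw logits keeps the two streams on a comparable scale, and `weight_2d` could be tuned on a validation set if one stream is more reliable than the other.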