A New Architecture of Neural Network for Fine-Grained Video Analysis Based on Visual Attention

LI Lin, SUN Kangbo, ZHU Jie
DOI: https://doi.org/10.3969/j.issn.1000-5137.2019.04.005
2019-01-01
Abstract: This paper proposes a deep network to improve recognition of fine-grained actions, developed on a subtle gesture video dataset. The architecture consists of a smaller variant of the convolutional 3-dimensional (C3D) network and a long short-term memory (LSTM) network with a soft attention mechanism. The depth of the C3D network and the weight penalty of the attention mechanism were optimized. Experimental results show that the fine-grained action recognition network effectively focuses on the regions that carry important information and performs better in both average accuracy and detection accuracy on the dataset.
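The soft attention step named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the additive scoring function, the parameter names (`w_feat`, `w_hid`, `v`), the feature dimensions, and the form of the weight penalty (a common doubly stochastic regularizer) are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def soft_attention(features, hidden, w_feat, w_hid, v):
    """Soft attention over K region features (assumed additive scoring).

    features: (K, D) region features, e.g. flattened C3D feature-map cells.
    hidden:   (H,) previous LSTM hidden state.
    Returns attention weights (K,) and the context vector (D,).
    """
    # One scalar relevance score per region.
    scores = np.tanh(features @ w_feat + hidden @ w_hid) @ v
    # Normalize the scores into a probability distribution over regions.
    alpha = softmax(scores)
    # The context is the attention-weighted average of the region features.
    context = alpha @ features
    return alpha, context

def attention_penalty(alphas, lam):
    """Assumed penalty form: lam * sum_i (1 - sum_t alpha[t, i])^2.

    alphas: (T, K) attention weights over T time steps.
    Encourages every region to receive attention at some point in time.
    """
    return lam * np.sum((1.0 - alphas.sum(axis=0)) ** 2)

# Hypothetical sizes: 49 regions, 16-dim features, 8-dim hidden state.
rng = np.random.default_rng(0)
K, D, H, A = 49, 16, 8, 12
features = rng.normal(size=(K, D))
hidden = rng.normal(size=H)
w_feat = rng.normal(size=(D, A))
w_hid = rng.normal(size=(H, A))
v = rng.normal(size=A)

alpha, context = soft_attention(features, hidden, w_feat, w_hid, v)
```

In a full model, `context` would be fed to the LSTM at each time step, and the penalty term would be added to the classification loss with a tunable weight `lam`.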