Short-Term Action Recognition by 3D Convolutional Neural Network with Pixel-Wise Evidences

XiaoHan Wang, Junichi Miyao, Takio Kurita
DOI: https://doi.org/10.1007/978-981-15-4818-5_6
2020-01-01
Abstract: Action recognition in videos has become popular in recent years. The main difficulty is extracting the temporal information that characterizes the target actions. In this paper, we propose a conceptually simple network for short-term action recognition. The proposed architecture extends a standard neural network to an autoencoder that estimates pixel-wise evidence in the frames; this evidence is then integrated by a simple classifier to classify the actions. The standard 2D convolutional layers used for image classification are extended to 3D convolutional layers in the autoencoder to extract the temporal information in the target actions. In the training phase, classifiers are attached to the middle layers so that the features of those layers are well discriminated. Classifiers are also attached at the end of the network to improve the performance of the standard classifier. We performed experiments on the UCF101 dataset to evaluate the effectiveness of the proposed architecture. The results show that our method performs effectively in short-term action recognition.
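To illustrate why extending 2D convolutions to 3D exposes temporal information, the following is a minimal sketch in plain Python, not the authors' network. It assumes a toy grayscale clip stored as nested lists indexed (time, height, width); the names `conv3d_valid`, `temporal_diff`, `make_frame`, and `energy` are hypothetical and chosen only for this example. A kernel that spans two frames in time gives no response on a static clip and a clear response on a clip in which a bar moves, which a purely spatial 2D kernel applied frame by frame could not distinguish.

```python
def conv3d_valid(video, kernel):
    """Naive 'valid' 3D cross-correlation over a (T, H, W) nested-list volume."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over the temporal axis as well as the two spatial axes.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += kernel[dt][dy][dx] * video[t + dt][y + dy][x + dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Assumed toy kernel: depth 2 in time, 1x1 in space (next frame minus current frame).
temporal_diff = [[[-1.0]], [[1.0]]]


def make_frame(col, size=4):
    """A size x size frame with a single bright vertical bar at column `col`."""
    return [[1.0 if x == col else 0.0 for x in range(size)] for _ in range(size)]


def energy(volume):
    """Sum of absolute responses over the whole output volume."""
    return sum(abs(v) for plane in volume for row in plane for v in row)


static_clip = [make_frame(1) for _ in range(3)]      # bar stays at column 1
moving_clip = [make_frame(c) for c in range(3)]      # bar moves right each frame

static_resp = conv3d_valid(static_clip, temporal_diff)
moving_resp = conv3d_valid(moving_clip, temporal_diff)

print("static energy:", energy(static_resp))
print("moving energy:", energy(moving_resp))
```

In a trained network the kernel weights would be learned rather than fixed, and each layer would hold many such kernels; the sketch only shows how the extra temporal axis lets a single filter respond to motion.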