Multi-source Learning for Skeleton-Based Action Recognition Using Deep LSTM Networks

Ran Cui, Aichun Zhu, Sai Zhang, Gang Hua
DOI: https://doi.org/10.1109/icpr.2018.8545247
2018-01-01
Abstract: Skeleton-based action recognition has attracted wide attention because skeletal information of the human body expresses action features simply and clearly and is not affected by the body's physical features. The action recognition method in this paper is therefore based on skeletal information extracted from RGBD video. Because the skeleton coordinates studied are two-dimensional, the method can also be applied directly to RGB video. Recently proposed methods based on deep networks focus only on the temporal dynamics of an action and ignore its spatial configuration. This paper proposes a Multi-source model that fuses a temporal model with a spatial model. The temporal model is divided into three branches, which perceive global-level, local-level, and detail-level information respectively. The spatial model perceives the relative position information of the skeleton joints. Fusing the two models improves recognition accuracy. The proposed method is compared with state-of-the-art methods on a large-scale dataset, and the experimental results demonstrate its effectiveness.
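The abstract does not specify how relative joint positions are encoded or how the two models are fused, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes the spatial input can be represented as pairwise 2D offsets between joints in a single frame, and that fusion is a weighted average of per-class scores. The names `relative_joint_offsets`, `fuse_scores`, and `temporal_weight` are hypothetical, and the coordinates and scores are made up.

```python
from itertools import combinations


def relative_joint_offsets(joints):
    """Return the (dx, dy) offset for every unordered pair of 2D joints.

    Assumption: pairwise offsets stand in for the "relative position
    information of skeleton joints"; the paper may encode it differently.
    """
    return [(xb - xa, yb - ya) for (xa, ya), (xb, yb) in combinations(joints, 2)]


def fuse_scores(temporal_scores, spatial_scores, temporal_weight=0.5):
    """Weighted average of two per-class score lists.

    Assumption: a simple late fusion; the abstract does not name the rule.
    """
    if len(temporal_scores) != len(spatial_scores):
        raise ValueError("score lists must have the same length")
    w = temporal_weight
    return [w * t + (1.0 - w) * s for t, s in zip(temporal_scores, spatial_scores)]


# Toy frame with three 2D joints (made-up coordinates).
frame = [(0.0, 0.0), (1.0, 2.0), (3.0, 1.0)]
offsets = relative_joint_offsets(frame)

# Toy per-class scores from a temporal and a spatial model (made-up values).
fused = fuse_scores([0.5, 0.5], [1.0, 0.0], temporal_weight=0.5)
```

In this sketch a frame with J joints yields J(J-1)/2 offset pairs, and the fused scores keep the same length as the inputs. In a full pipeline the offsets would feed the spatial model and the raw coordinate sequence would feed the temporal LSTM branches.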