Arbitrary-View Human Action Recognition: A Varying-View RGB-D Action Dataset

Yanli Ji, Yang, Fumin Shen, Heng Tao Shen, Wei-Shi Zheng
DOI: https://doi.org/10.1109/tcsvt.2020.2975845
IF: 5.859
2021-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Current research on action recognition, which focuses on single-view and multi-view recognition, can hardly satisfy the requirements of human-robot interaction (HRI) applications, which need to recognize human actions from arbitrary views. Arbitrary-view recognition remains challenging because of view changes and visual occlusions. The lack of suitable datasets is a further barrier. To provide data for arbitrary-view action recognition, we collect a new large-scale RGB-D action dataset that includes RGB videos, depth sequences, and skeleton sequences. The dataset contains action samples captured from 8 fixed viewpoints as well as varying-view sequences that cover the entire 360° range of view angles. In total, 118 persons were invited to perform 40 action categories. Our dataset involves more participants, more viewpoints, and a large number of samples. More importantly, it is the first dataset to contain varying-view sequences covering the entire 360° range. The dataset provides sufficient data for multi-view, cross-view, and arbitrary-view action analysis. In addition, we propose a View-guided Skeleton CNN (VS-CNN) to tackle the problem of arbitrary-view action recognition. Experimental results show that the VS-CNN achieves superior performance, and that our dataset provides valuable but challenging data for evaluating arbitrary-view recognition.