Cross-View Action Recognition Based on Hierarchical View-Shared Dictionary Learning.

Chengkun Zhang,Huicheng Zheng,Jianhuang Lai
DOI: https://doi.org/10.1109/access.2018.2815611
IF: 3.9
2018-01-01
IEEE Access
Abstract:Recognizing human actions across different views is challenging, since observations of the same action often vary greatly with viewpoints. To solve this problem, most existing methods explore the cross-view feature transfer relationship at video level only, ignoring the sequential composition of action segments therein. In this paper, we propose a novel hierarchical transfer framework, which is based on an action temporal-structure model that contains sequential relationship between action segments at multiple timescales. Thus, it can capture the view invariance of the sequential relationship of segment-level transfer. Additionally, we observe that the original feature distributions under different views differ greatly, leading to view-dependent representations irrelevant to the intrinsic structure of actions. Thus, at each level of the proposed framework, we transform the original feature spaces of different views to a view-shared low dimensional feature space, and jointly learn a dictionary in this space for these views. This view-shared dictionary captures the common structure of action data across the views and can represent the action segments in a way robust to view changes. Moreover, the proposed method can be kernelized easily, and operate in both unsupervised and supervised cross-view scenarios. Extensive experimental results on the IXMAS and WVU datasets demonstrate superiority of the proposed method over state-of-the-art methods.
What problem does this paper attempt to address?