Projection Transform on Spatio-Temporal Context for Action Recognition

Wanru Xu, Zhenjiang Miao, Qiang Zhang
DOI: https://doi.org/10.1007/s11042-014-2007-1
IF: 2.577
2014-01-01
Multimedia Tools and Applications
Abstract: This paper addresses the task of human action recognition, which is important to applications such as video surveillance and video retrieval. Most existing work on human action analysis based on local interest points loses information about the spatio-temporal distribution of features and neglects the relationship between features and each defined action. In this paper, through an analysis of the feature distribution and feature interactions over the spatio-temporal domain, we propose a novel projection transform that takes these two factors into account. From our perspective, a video sequence of a human action can be modeled by three types of features of spatio-temporal interest points: the global projection transform feature, the relative position distribution feature, and the bag-of-visual-words-based feature. A new context-K-nearest-neighbor classifier is then used to fuse them into discriminative feature sets for action matching. In most cases, our experiments indicate that the proposed method outperforms previously published results on the Weizmann and KTH datasets.