Spatio-temporal Semantic Features for Human Action Recognition.

Jia Liu,Xiaonian Wang,Tianyu Li,Jie Yang
DOI: https://doi.org/10.3837/tiis.2012.10.011
2012-01-01
KSII Transactions on Internet and Information Systems
Abstract:Most approaches to human action recognition is limited due to the use of simple action datasets under controlled environments or focus on excessively localized features without sufficiently exploring the spatio-temporal information. This paper proposed a framework for recognizing realistic human actions. Specifically, a new action representation is proposed based on computing a rich set of descriptors from keypoint trajectories. To obtain efficient and compact representations for actions, we develop a feature fusion method to combine spatial-temporal local motion descriptors by the movement of the camera which is detected by the distribution of spatio-temporal interest points in the clips. A new topic model called Markov Semantic Model is proposed for semantic feature selection which relies on the different kinds of dependencies between words produced by "syntactic" and "semantic" constraints. The informative features are selected collaboratively based on the different types of dependencies between words produced by short range and long range constraints. Building on the nonlinear SVMs, we validate this proposed hierarchical framework on several realistic action datasets.
What problem does this paper attempt to address?