Recognizing Actions Via Sparse Coding on Structure Projection

Lei Zhang, Tao Wang, Xiantong Zhen
DOI: https://doi.org/10.1109/icip.2013.6738497
2013-01-01
Abstract: In this paper, we propose a novel method for human action recognition based on sparse coding with pyramid matching. Spatio-temporal interest points (STIPs) are first detected by a newly developed detector, the spatio-temporal steerable detector (STSD). To effectively capture the distribution of STIPs in the video sequence, we project the STIPs onto three orthogonal planes (TOP) and employ a sparse coding algorithm combined with spatial pyramid matching to encode the layout of the STIPs. The structure of an action is thereby sufficiently encoded, yielding an informative holistic descriptor for action representation. Extensive experiments have been conducted on the KTH and HMDB51 datasets. Our method achieves state-of-the-art performance for action recognition, demonstrating the effectiveness of the proposed method for human action representation.
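The projection and pyramid steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes STIP locations are given as (x, y, t) coordinates already scaled to [0, 1], and it substitutes plain occupancy counts for the sparse codes that the paper pools over the pyramid. The function names `project_to_top`, `pyramid_histogram`, and `top_descriptor` are hypothetical.

```python
import numpy as np


def project_to_top(points):
    """Project (x, y, t) points onto the XY, XT and YT planes."""
    pts = np.asarray(points, dtype=float)
    return {
        "xy": pts[:, [0, 1]],  # drop t: spatial layout
        "xt": pts[:, [0, 2]],  # drop y: horizontal motion over time
        "yt": pts[:, [1, 2]],  # drop x: vertical motion over time
    }


def pyramid_histogram(coords, levels=3):
    """Concatenate normalized point counts over 1x1, 2x2, 4x4, ... grids.

    Assumption: coords lie in [0, 1] x [0, 1]. The paper pools sparse
    codes per cell; raw counts are used here as a simpler stand-in.
    """
    feats = []
    for level in range(levels):
        cells = 2 ** level
        hist, _, _ = np.histogram2d(
            coords[:, 0], coords[:, 1],
            bins=cells, range=[[0, 1], [0, 1]],
        )
        feats.append(hist.ravel() / max(len(coords), 1))
    return np.concatenate(feats)


def top_descriptor(points, levels=3):
    """Holistic descriptor: pyramid histograms of all three projections."""
    planes = project_to_top(points)
    return np.concatenate(
        [pyramid_histogram(planes[k], levels) for k in ("xy", "xt", "yt")]
    )
```

With three pyramid levels, each plane contributes 1 + 4 + 16 = 21 cells, so the concatenated descriptor has 63 dimensions. In a full pipeline this vector would be passed to a classifier; the choice of classifier is not stated in the abstract.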