Local Spatiotemporal Coding and Sparse Representation Based Human Action Recognition

Bin Wang, Yu Liu, Wei Wang, Wei Xu, Maojun Zhang
DOI: https://doi.org/10.4028/www.scientific.net/amm.401-403.1555
2013-01-01
Applied Mechanics and Materials
Abstract: To address the limitation of the bag-of-features (BoF) model, which ignores the spatial and temporal relationships of local features in video-based human action recognition, a Local Spatiotemporal Coding (LSC) method is proposed. Unlike existing methods, which use only feature appearance information for coding, LSC encodes feature appearance and spatiotemporal position information simultaneously with vector quantization (VQ). It can therefore directly model the spatiotemporal relationships of local features in the space-time volume (STV). In the implementation, the local features are projected into sub-space-time-volumes (sub-STVs) and encoded with LSC. In addition, a multi-level LSC is provided. A group of sub-STV descriptors, obtained from videos with multi-level LSC and Avg-pooling, is then used for action video classification. A sparse representation based classification method is adopted to classify action videos upon these sub-STV descriptors. Experimental results on the KTH, Weizmann, and UCF sports datasets show that our method achieves better performance than previous human action recognition methods based on local spatiotemporal features.
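The coding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `lsc_encode`, the `weight` parameter that scales positions against appearance, the concatenation of descriptor and (x, y, t) position into one joint vector, and the toy data are all assumptions made for illustration. The sketch hard-assigns each local feature to its nearest codeword in the joint appearance-position space (VQ) and then averages the one-hot codes (Avg-pooling) to obtain one descriptor for a sub-STV.

```python
import numpy as np

def lsc_encode(descriptors, positions, codebook, weight=1.0):
    """Sketch of joint appearance + position VQ followed by Avg-pooling.

    descriptors: (n, d) appearance descriptors of the local features
    positions:   (n, 3) hypothetical (x, y, t) positions inside one sub-STV
    codebook:    (k, d + 3) codewords in the joint space (assumed layout)
    weight:      assumed scale factor balancing position against appearance
    """
    # Concatenate appearance with weighted spatiotemporal position.
    joint = np.hstack([descriptors, weight * positions])
    # Squared Euclidean distance from every feature to every codeword.
    dists = ((joint[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    # Vector quantization: index of the nearest codeword per feature.
    assignments = dists.argmin(axis=1)
    # One-hot codes, averaged over all features (Avg-pooling).
    codes = np.eye(codebook.shape[0])[assignments]
    return codes.mean(axis=0)

# Toy data: random features and a random codebook, for shape checking only.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(50, 8))
positions = rng.uniform(size=(50, 3))
codebook = rng.normal(size=(16, 11))
sub_stv_descriptor = lsc_encode(descriptors, positions, codebook)

# Two features with identical appearance but different positions fall on
# different codewords, which an appearance-only coding could not separate.
same_look = np.zeros((2, 1))
where = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
tiny_codebook = np.array([[0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 1.0, 1.0]])
split = lsc_encode(same_look, where, tiny_codebook)
```

In this sketch the pooled vector is a normalized histogram over codewords. A multi-level variant and the sparse representation based classifier mentioned in the abstract are not covered here.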