Action Recognition Based on Spatio-temporal Log-Euclidean Covariance Matrix

Shilei Cheng, Jiangfeng Yang, Zheng Ma, Mei Xie
DOI: https://doi.org/10.14257/ijsip.2016.9.2.09
2016-01-01
Abstract: In this paper, we address the problem of human action recognition by combining covariance matrices, used as local spatio-temporal (ST) descriptors, with local ST features extracted densely from action videos. Unlike traditional methods, which use gradient-based and optical flow-based features separately, we use the covariance matrix to fuse the two types of features. Covariance matrices are Symmetric Positive Definite (SPD) matrices, which form a special type of Riemannian manifold. To measure the distance between SPD matrices while avoiding computation of the geodesic distance between them, the covariance features are transformed into log-Euclidean covariance matrices (LECM) by the matrix logarithm operation. After the LECM features are encoded with the Locality-constrained Linear Coding method, a spatial pyramid is used to partition the video frames in order to provide position information to the ST-LECM features, and an average-pooling-on-absolute-value function is applied over each sub-frame. Finally, a non-linear support vector machine is used as the classifier. Experiments on public human action datasets show that the proposed method obtains great improvements in recognition accuracy in comparison with several state-of-the-art methods.
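The covariance-to-LECM step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the random placeholder features, the small `eps` regularizer, and the sqrt(2)-weighted half-vectorization are all assumptions chosen for illustration. The sketch computes the covariance of per-sample feature vectors from one ST cuboid, takes the matrix logarithm through an eigendecomposition, and flattens the result so that ordinary Euclidean distance can be applied to it.

```python
import numpy as np


def log_euclidean_covariance(features, eps=1e-6):
    """Map an (n_samples, d) feature array to a d x d log-covariance matrix.

    `eps` is an assumed regularizer that keeps the covariance strictly
    positive definite so that the logarithm is well defined.
    """
    features = np.asarray(features, dtype=float)
    cov = np.cov(features, rowvar=False)        # d x d sample covariance
    cov = cov + eps * np.eye(cov.shape[0])      # nudge towards SPD
    eigvals, eigvecs = np.linalg.eigh(cov)      # symmetric eigendecomposition
    # Matrix logarithm of an SPD matrix: V diag(log(lambda)) V^T
    return (eigvecs * np.log(eigvals)) @ eigvecs.T


def vectorize_upper(log_cov):
    """Flatten the upper triangle of a symmetric matrix into a vector.

    Off-diagonal entries are scaled by sqrt(2) (an assumed convention) so
    that the vector's Euclidean norm equals the matrix's Frobenius norm.
    """
    rows, cols = np.triu_indices(log_cov.shape[0])
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return log_cov[rows, cols] * scale


# Placeholder data: 200 samples of a hypothetical 5-dimensional feature
# (for example gradient and optical-flow components) from one ST cuboid.
rng = np.random.default_rng(0)
cuboid_features = rng.normal(size=(200, 5))

lecm = log_euclidean_covariance(cuboid_features)
descriptor = vectorize_upper(lecm)
```

With this representation, the distance between two descriptors is a plain Euclidean distance, which is what allows a coding method and a standard classifier to be applied afterwards without computing geodesic distances on the SPD manifold.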