Learning 3D Compact Binary Descriptor for Human Action Recognition in Video.

Dongcheng Huang, Xiang Li, Hongwei Li, Wei-Shi Zheng
DOI: https://doi.org/10.1007/978-3-319-25417-3_72
2015-01-01
Abstract: Hand-crafted descriptors are currently widely used for human action recognition in video. However, they are not optimized for the task and may lack discriminative information. To compensate for this drawback, this paper presents a learning-based 3D compact binary descriptor (3D-CBD) for human action video representation. The proposed descriptor is a 3D extension of the compact binary face descriptor (CBFD). Given a video sequence, we first extract pixel difference vectors (PDVs) in local volumes and then learn a feature mapping to project these PDVs into low-dimensional binary vectors. Finally, we cluster these binary codes and pool them into a histogram feature that represents the video sequence. Experimental results on two action datasets (KTH and WEIZMANN) demonstrate the effectiveness of the proposed descriptor.
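The three stages named in the abstract (PDV extraction, projection to binary codes, clustering and pooling) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the projection matrix here is random rather than learned, the codebook is sampled from the codes instead of being obtained by clustering, a single histogram is pooled over the whole clip, and all function names and parameter values (`extract_pdvs`, `binarize`, `pool_histogram`, 8 bits, 16 codewords) are hypothetical.

```python
import numpy as np

def extract_pdvs(video, radius=1):
    """Pixel difference vectors: each interior voxel's neighbours minus the voxel itself."""
    T, H, W = video.shape
    r = radius
    center = video[r:T - r, r:H - r, r:W - r].astype(np.float64)
    diffs = []
    for dt in range(-r, r + 1):
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                if dt == 0 and dy == 0 and dx == 0:
                    continue  # skip the centre voxel itself
                neighbour = video[r + dt:T - r + dt, r + dy:H - r + dy, r + dx:W - r + dx]
                diffs.append((neighbour - center).reshape(-1))
    return np.stack(diffs, axis=1)  # shape: (num_voxels, (2r+1)^3 - 1)

def binarize(pdvs, projection):
    """Project PDVs to a low-dimensional space and threshold at zero."""
    return (pdvs @ projection > 0).astype(np.uint8)

def pool_histogram(codes, codebook):
    """Assign each binary code to its nearest codeword (Hamming distance) and count."""
    dists = (codes[:, None, :] != codebook[None, :, :]).sum(axis=2)
    nearest = dists.argmin(axis=1)
    hist = np.bincount(nearest, minlength=len(codebook)).astype(np.float64)
    return hist / hist.sum()

# Toy usage on a random "video" of 8 frames, 16x16 pixels each.
rng = np.random.default_rng(0)
video = rng.integers(0, 256, size=(8, 16, 16)).astype(np.uint8)
pdvs = extract_pdvs(video)                                # 26-dimensional PDVs
projection = rng.standard_normal((pdvs.shape[1], 8))      # assumption: random, not learned
codes = binarize(pdvs, projection)                        # 8-bit binary codes
codebook = codes[rng.choice(len(codes), size=16, replace=False)]  # assumption: sampled, not clustered
hist = pool_histogram(codes, codebook)                    # clip-level histogram feature
```

In a learned variant, the random `projection` would be replaced by a mapping optimized on training PDVs, and the sampled `codebook` by cluster centres; the resulting histogram would then be passed to a classifier.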