Strategy for dynamic 3D depth data matching towards robust action retrieval.

Sicheng Zhao, Lujun Chen, Hongxun Yao, Yanhao Zhang, Xiaoshuai Sun
DOI: https://doi.org/10.1016/j.neucom.2014.03.092
IF: 6
2015-01-01
Neurocomputing
Abstract: 3D depth data, especially dynamic 3D depth data, offer several advantages over traditional intensity videos for expressing objects' actions: they remain usable in low light, resolve the silhouette ambiguity of actions, and are invariant to color and texture. With the wide popularity of motion-sensing devices (Kinect, for example), more and more dynamic 3D depth data are shared on the Internet, which creates an urgent need to retrieve these data efficiently and effectively. In this paper, we propose a generalized strategy for dynamic 3D depth data matching and apply it to the action retrieval task. First, an improved 3D shape context descriptor (3DSCD) is proposed to extract features from each static depth frame. Then we employ dynamic time warping (DTW) to measure the temporal similarity between two dynamic 3D depth sequences. Experimental results on our collected dataset of 170 dynamic 3D depth video clips show that the proposed 3DSCD has rich descriptive power on depth data and that the method combining 3DSCD and DTW achieves high matching accuracy. Finally, to address matching efficiency, we use the bag-of-words (BoW) model to quantize the 3DSCD of each static depth frame into visual words, so that the original feature matching problem is simplified into a matching problem between two histograms. The results demonstrate the matching efficiency of the proposed method, which still maintains high matching accuracy.
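The two matching steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 3DSCD extraction is not reproduced, each frame is assumed to be already summarized as one fixed-length feature vector, and the function names (`dtw_distance`, `bow_histogram`, `histogram_distance`), the Euclidean frame distance, the codebook, and the L1-based histogram distance are all assumptions chosen for illustration. The abstract does not state at which granularity the histograms are built, so the per-clip histogram below is one possible reading.

```python
import numpy as np


def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two sequences of per-frame feature vectors.

    Assumption: frames are compared with Euclidean distance.
    """
    a = np.asarray(seq_a, dtype=float)
    b = np.asarray(seq_b, dtype=float)
    n, m = len(a), len(b)
    # Pairwise distance between every frame of a and every frame of b.
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    # acc[i, j] = minimal accumulated cost aligning a[:i] with b[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = cost[i - 1, j - 1] + best_prev
    return float(acc[n, m])


def bow_histogram(frames, codebook):
    """Assign each frame vector to its nearest codeword (hypothetical codebook)
    and return the normalized word-count histogram of the clip."""
    f = np.asarray(frames, dtype=float)
    c = np.asarray(codebook, dtype=float)
    dists = np.linalg.norm(f[:, None, :] - c[None, :, :], axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(c)).astype(float)
    return hist / hist.sum()


def histogram_distance(h1, h2):
    """Half the L1 distance between two normalized histograms (assumed metric)."""
    return float(0.5 * np.abs(np.asarray(h1) - np.asarray(h2)).sum())


if __name__ == "__main__":
    clip_a = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    clip_b = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]  # same action, slower start
    print("DTW cost:", dtw_distance(clip_a, clip_b))

    codebook = [[0.0, 0.0], [2.0, 2.0]]
    h_a = bow_histogram(clip_a, codebook)
    h_b = bow_histogram(clip_b, codebook)
    print("Histogram distance:", histogram_distance(h_a, h_b))
```

The trade-off the sketch makes visible: DTW fills a table whose size is the product of the two sequence lengths, whereas comparing two histograms costs only as much as the codebook size, at the price of discarding frame order in this simplified per-clip form.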