Large-scale Gesture Recognition with a Fusion of RGB-D Data Based on Optical Flow and the C3D Model

Yunan Li, Qiguang Miao, Kuan Tian, Yingying Fan, Xin Xu, Rui Li, Jianfeng Song
DOI: https://doi.org/10.1016/j.patrec.2017.12.003
IF: 4.757
2017-01-01
Pattern Recognition Letters
Abstract: Gesture recognition has attracted attention in computer vision owing to its many applications. However, video-based large-scale gesture recognition still faces many challenges, since factors such as the background can degrade accuracy. To achieve gesture recognition on large-scale video data, we propose a method based on RGB-D data. To learn gesture details better, the inputs are first expanded into 32-frame videos, and the RGB and depth videos are then fed separately to the C3D model to extract spatiotemporal features. These features are then combined to boost performance; because C3D features have a uniform dimension, this combination also avoids unreasonable synthetic data. Our approach achieves 49.2% accuracy on the validation subset of the ChaLearn LAP IsoGD database with only a linear SVM classifier. It also outperforms the baseline and other methods in the challenge, winning first place with 56.9% accuracy on the testing set.
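The two data-handling steps named in the abstract, bringing every clip to 32 frames and combining the RGB and depth features, can be illustrated with a minimal sketch. It rests on assumptions: the abstract does not say how frames are selected or how the features are combined, so uniform index sampling and the "concat" and "mean" fusion modes are stand-ins, and the function names `resample_indices` and `fuse_features` are hypothetical. C3D feature extraction and the linear SVM are not shown.

```python
def resample_indices(num_frames, target=32):
    """Return `target` frame indices spread evenly over a clip.

    Assumption: plain uniform sampling. Clips shorter than `target`
    repeat frames; longer clips skip frames.
    """
    if num_frames <= 0:
        raise ValueError("clip must contain at least one frame")
    return [min(num_frames - 1, (i * num_frames) // target) for i in range(target)]


def fuse_features(rgb_feat, depth_feat, mode="concat"):
    """Combine RGB and depth feature vectors of equal length.

    Assumption: the fusion operator is not given in the abstract, so
    two common choices are offered for illustration only.
    """
    if len(rgb_feat) != len(depth_feat):
        raise ValueError("feature vectors must have the same length")
    if mode == "concat":
        # Stack the two modalities into one longer vector.
        return list(rgb_feat) + list(depth_feat)
    if mode == "mean":
        # Element-wise average keeps the original dimension.
        return [(r + d) / 2.0 for r, d in zip(rgb_feat, depth_feat)]
    raise ValueError(f"unknown mode: {mode}")
```

In a pipeline of this shape, `resample_indices(len(clip))` would pick the frames passed to the feature extractor for each modality, and the fused vector would be the input to the classifier. Both fusion modes require equal-length inputs, which is the property the abstract attributes to C3D features.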