Continuous Action Recognition Based on Hybrid CNN-LDCRF Model

Jun Lei, Guohui Li, Shuohao Li, Dan Tu, Qiang Guo
DOI: https://doi.org/10.1109/icivc.2016.7571275
2016-01-01
Abstract: Continuous action recognition in video is more challenging than traditional isolated action recognition. In this paper, we propose a hybrid framework combining a Convolutional Neural Network (CNN) and a Latent-Dynamic Conditional Random Field (LDCRF) to segment and recognize continuous actions simultaneously. Most existing action recognition works construct complex handcrafted features, which are highly problem dependent. We use a CNN, a type of deep model, to automatically learn high-level action features directly from raw inputs. The LDCRF is used to model the intrinsic and extrinsic dynamics of actions. The CNN is embedded in the bottom layer of the LDCRF, which converts the structure of the LDCRF from shallow to deep. This framework incorporates action feature learning and continuous action recognition in a unified way. The model is trained end to end: the parameters of the CNN and the LDCRF are jointly optimized by gradient descent. We test our method on two public datasets: KTH and HumanEva. Experiments show that our method achieves improved recognition accuracy compared with several other methods. We also demonstrate the superiority of features learned by the CNN over handcrafted features.
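The abstract does not give the model's equations, so the following is only a minimal sketch of the general LDCRF idea it builds on, not the authors' implementation. It assumes the common LDCRF formulation in which each action label owns a disjoint set of hidden states, and the probability of a label sequence is the sum over all hidden-state sequences consistent with it. The per-frame scores `unary` stand in for what a CNN would output for each frame; `trans`, `state_label`, and all function names are hypothetical and chosen for illustration.

```python
import numpy as np

def logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def log_partition(unary, trans, mask=None):
    # Forward algorithm in log space over a linear chain.
    # unary: (T, H) per-frame hidden-state scores (assumed to come from a CNN).
    # trans: (H, H) transition scores, trans[i, j] for moving from state i to j.
    # mask:  optional (T, H) boolean array; disallowed states get score -inf.
    if mask is not None:
        unary = np.where(mask, unary, -np.inf)
    alpha = unary[0]
    for t in range(1, len(unary)):
        alpha = unary[t] + logsumexp(alpha[:, None] + trans, axis=0)
    return logsumexp(alpha, axis=0)

def ldcrf_log_prob(unary, trans, labels, state_label):
    # log P(labels | frames): hidden paths restricted to each frame's label,
    # normalized by the sum over all hidden paths.
    # labels:      (T,) label index per frame.
    # state_label: (H,) label that owns each hidden state (disjoint sets).
    mask = state_label[None, :] == labels[:, None]
    return log_partition(unary, trans, mask) - log_partition(unary, trans)

# Toy setup: 3 frames, 2 labels, 2 hidden states per label (all values random).
rng = np.random.default_rng(0)
state_label = np.repeat(np.arange(2), 2)   # hidden states 0,1 -> label 0; 2,3 -> label 1
unary = rng.normal(size=(3, 4))
trans = rng.normal(size=(4, 4))
log_p = ldcrf_log_prob(unary, trans, np.array([0, 0, 1]), state_label)
```

In a jointly trained model of the kind the abstract describes, the gradient of this log-likelihood with respect to `unary` would be passed back into the CNN; that step is omitted here.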