Animated Pose Templates for Modelling and Detecting Human Actions.

Benjamin Yao, Zicheng Liu, Xiaohan Nie, Song-Chun Zhu
DOI: https://doi.org/10.1109/TPAMI.2013.144
IF: 23.6
2014-01-01
IEEE Transactions on Pattern Analysis and Machine Intelligence
Abstract: This paper presents Animated Pose Templates for detecting short-term, long-term, and contextual actions from cluttered scenes in videos. Each pose template consists of two components: i) a shape template whose appearance is represented by Histogram of Oriented Gradients (HOG) features; and ii) a motion template using Histogram of Optical Flow (HOF) features. While this pose template is suitable for detecting short-term action snippets, we extend it in two ways: i) for long-term actions, we animate the pose templates by adding temporal constraints in a Hidden Markov Model; and ii) for contextual actions, we treat contextual objects as additional parts of the pose templates. To train the model, we manually annotate part locations on some key frames, then introduce a Semi-Supervised Structural SVM algorithm that iterates between: i) learning model parameters from labeled data by solving a structural SVM optimization; and ii) imputing latent variables on unannotated frames and progressively accepting high-scoring ones as newly labeled examples. The inference algorithm has two steps: i) detecting top candidates for the pose templates; and ii) computing the sequence of pose templates. Both are done by dynamic programming. In experiments, we test our method on both public datasets and our own. The results show that our model achieves comparable or better performance than state-of-the-art methods.
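The second inference step, computing the sequence of pose templates by dynamic programming, can be illustrated with a generic max-sum (Viterbi) recursion. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `best_pose_sequence` and the inputs `unary` (a per-frame detection score for each template) and `transition` (a temporal compatibility score between consecutive templates) are hypothetical stand-ins for the HOG/HOF template scores and HMM temporal constraints described in the abstract.

```python
from typing import List, Sequence


def best_pose_sequence(unary: Sequence[Sequence[float]],
                       transition: Sequence[Sequence[float]]) -> List[int]:
    """Pick one template index per frame, maximizing the total score.

    unary[t][j]      -- assumed detection score of template j at frame t
    transition[i][j] -- assumed score for moving from template i to j
    """
    n_frames = len(unary)
    n_templates = len(transition)
    if n_frames == 0:
        return []

    # score[j]: best total score of any sequence ending in template j
    score = list(unary[0])
    backptr = []  # backptr[t-1][j]: best predecessor of j at frame t

    for t in range(1, n_frames):
        new_score = []
        pointers = []
        for j in range(n_templates):
            best_i = max(range(n_templates),
                         key=lambda i: score[i] + transition[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + transition[best_i][j] + unary[t][j])
        score = new_score
        backptr.append(pointers)

    # Backtrack from the best final template to recover the sequence.
    state = max(range(n_templates), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(backptr):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # Toy example: two templates, three frames (all numbers are made up).
    unary = [[2.0, 0.0], [0.0, 1.0], [0.0, 3.0]]
    transition = [[0.0, -1.0], [-5.0, 0.0]]
    print(best_pose_sequence(unary, transition))
```

On this toy input the sketch prints `[0, 1, 1]`: the large penalty for moving from template 1 back to template 0 makes the sequence switch once and stay there. The recursion costs O(T K^2) for T frames and K templates, which is why restricting each frame to a short list of top candidates (the first inference step) keeps the sequence search cheap.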