Action and Gesture Temporal Spotting with Super Vector Representation

Xiaojiang Peng, Limin Wang, Zhuowei Cai, Yu Qiao
DOI: https://doi.org/10.1007/978-3-319-16178-5_36
2015-01-01
Abstract: This paper describes our method for both track 2 and track 3 of the Looking at People (LAP) challenge [1]. We propose an action and gesture spotting system composed of three main steps: (i) temporal segmentation, (ii) clip classification, and (iii) post-processing. For track 2, we use a simple sliding window method to divide each video sequence into clips, while for track 3, we design a segmentation method based on motion analysis of human hands. Then, for each clip, we adopt a super vector representation built on dense features. Based on this representation, we train a linear SVM to perform action and gesture recognition. Finally, we apply post-processing techniques to avoid false positive detections. We demonstrate the effectiveness of the proposed method by participating in the contests of both track 2 and track 3. We obtain the best performance on track 2 and rank 4th on track 3, which indicates that the designed system is effective for action and gesture recognition.
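To make the track 2 pipeline more concrete, the following is a minimal Python sketch of the sliding window segmentation step followed by a simple merge of clip-level predictions. It is an illustration under stated assumptions, not the authors' implementation: the function names `sliding_window_clips` and `merge_detections`, the window and stride values, the `"none"` background label, and the merging rule are all hypothetical, since the abstract does not specify the window size or the post-processing techniques used. The classifier itself (super vector encoding plus a linear SVM) is omitted; the labels are passed in as if they had already been predicted.

```python
def sliding_window_clips(num_frames, window, stride):
    """Split a video of num_frames frames into fixed-length clips.

    Returns a list of (start, end) frame indices, with end exclusive.
    The window and stride values are assumptions for illustration.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    clips = []
    start = 0
    while start + window <= num_frames:
        clips.append((start, start + window))
        start += stride
    return clips


def merge_detections(clips, labels, background="none"):
    """Hypothetical post-processing: drop background clips and merge
    touching or overlapping clips that share the same predicted label."""
    detections = []
    for (start, end), label in zip(clips, labels):
        if label == background:
            continue
        if detections and detections[-1][2] == label and start <= detections[-1][1]:
            prev_start, prev_end, _ = detections[-1]
            detections[-1] = (prev_start, max(prev_end, end), label)
        else:
            detections.append((start, end, label))
    return detections


# Example: a 10-frame video, 4-frame window, stride of 2 frames.
clips = sliding_window_clips(10, window=4, stride=2)
# Labels stand in for the output of a clip classifier (assumed here).
labels = ["wave", "wave", "none", "clap"]
detections = merge_detections(clips, labels)
```

In this sketch, overlapping windows that receive the same label collapse into a single temporal detection, which is one plausible way to reduce duplicate or spurious detections after clip-level classification.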