Action recognition and detection by combining motion and appearance features

Limin Wang, Yu Qiao, Xiaoou Tang
2014-01-01
Abstract: We present a system for action recognition and detection in temporally untrimmed videos that combines motion and appearance features. Motion and appearance provide two complementary cues for understanding human actions in video. For motion features, we adopt the Fisher vector representation of improved dense trajectories (iDT) because of its rich descriptive capacity. For appearance features, we use deep convolutional neural network (CNN) activations because of their recent success in image-based tasks. With the fused iDT and CNN feature, we train an SVM classifier for each action class in a one-vs-all scheme. We report both the recognition and detection results of our system on the THUMOS 14 Challenge.
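The fusion and one-vs-all training steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the two features are normalized or combined, so per-block L2 normalization followed by concatenation is assumed here, and a small linear SVM trained by subgradient descent on the hinge loss stands in for whatever SVM solver the system actually uses. All function names (`fuse_features`, `train_one_vs_all`, `predict`) and hyperparameters are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm (assumed preprocessing, not from the paper)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def fuse_features(idt_fv, cnn_act):
    """Concatenate per-video iDT Fisher vectors and CNN activations.

    Both inputs have shape (num_videos, dim). Normalizing each block first
    keeps either modality from dominating purely because of its scale.
    """
    return np.hstack([l2_normalize(idt_fv), l2_normalize(cnn_act)])


def train_one_vs_all(features, labels, num_classes, lr=0.1, reg=1e-3, epochs=200):
    """Train one linear SVM per class (class c vs. the rest).

    Uses plain subgradient descent on the L2-regularized hinge loss as a
    stand-in for a production SVM solver.
    """
    n, d = features.shape
    W = np.zeros((num_classes, d))
    b = np.zeros(num_classes)
    for c in range(num_classes):
        # One-vs-all targets: +1 for class c, -1 for every other class.
        y = np.where(labels == c, 1.0, -1.0)
        for _ in range(epochs):
            margins = y * (features @ W[c] + b[c])
            active = margins < 1.0  # samples that violate the margin
            y_act = y[active][:, None]
            grad_w = reg * W[c] - (y_act * features[active]).sum(axis=0) / n
            grad_b = -y[active].sum() / n
            W[c] -= lr * grad_w
            b[c] -= lr * grad_b
    return W, b


def predict(features, W, b):
    """Assign each video to the class whose classifier scores highest."""
    return np.argmax(features @ W.T + b, axis=1)
```

For recognition, `predict` would be applied to one fused vector per video. For detection in untrimmed video, the same per-class scores could in principle be computed over candidate temporal windows, though the abstract does not describe how windows are generated or scored.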