Action Recognition by Jointly Using Video Proposal and Trajectory

Lei Qi, Xiaoqiang Lu, Xuelong Li
DOI: https://doi.org/10.1145/3271553.3271563
2018-01-01
Abstract: Human action recognition in videos is a popular and challenging research topic in the computer vision community. In recent years, trajectory-based methods have proven effective for action recognition. However, because trajectories are generated around motion regions, trajectory-based methods often attend only to regions with high motion saliency and ignore motionless but semantically meaningful objects. To compensate for this shortcoming, this paper uses video proposals for their ability to discover semantic objects. In the proposed method, video proposals and trajectories are extracted simultaneously to capture both motion information and object information. The method consists of three steps: 1) trajectories and video proposals are extracted from the video to capture motion information and object information, respectively; 2) a trained Convolutional Neural Network (CNN) model is employed to describe the extracted trajectories and video proposals; 3) a holistic representation of the video is constructed with the Fisher Vector model and fed to a classifier to obtain the action label. The complementarity between trajectories and video proposals gives the proposed method discriminative power across different kinds of actions. The proposed method is evaluated on UCF101 and HMDB51, where promising results demonstrate its effectiveness.
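Step 3 of the abstract can be illustrated with a minimal sketch of Fisher Vector encoding. The abstract does not specify the Fisher Vector variant, the mixture model, or the normalization, so everything below is an assumption: a diagonal-covariance Gaussian mixture that has already been fitted, gradients with respect to means and standard deviations, and signed square-root plus L2 normalization. The function name `fisher_vector` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def fisher_vector(descriptors, weights, means, variances):
    """Encode local descriptors (N, D) as a Fisher Vector of length 2*K*D.

    weights (K,), means (K, D) and variances (K, D) describe a
    diagonal-covariance GMM that is assumed to be fitted beforehand.
    """
    x = np.asarray(descriptors, dtype=np.float64)
    n = x.shape[0]

    # Differences to each component, scaled by its standard deviation: (N, K, D)
    diff = (x[:, None, :] - means[None, :, :]) / np.sqrt(variances)[None, :, :]

    # Per-component log-likelihood, then soft assignments (posteriors): (N, K)
    log_prob = -0.5 * (np.sum(diff ** 2, axis=2)
                       + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :])
    log_post = np.log(weights)[None, :] + log_prob
    log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    # Gradients with respect to the means and the standard deviations: (K, D) each
    g_mu = np.einsum('nk,nkd->kd', post, diff) / (n * np.sqrt(weights)[:, None])
    g_sigma = (np.einsum('nk,nkd->kd', post, diff ** 2 - 1.0)
               / (n * np.sqrt(2.0 * weights)[:, None]))

    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])

    # Signed square-root (power) normalization followed by L2 normalization
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv
```

Under these assumptions, the CNN descriptors of the trajectories and of the video proposals could each be encoded with their own mixture model and the two vectors combined (for example by concatenation) before classification. How the paper actually fuses the two streams is not stated in the abstract.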