Action Recognition Through Discovering Distinctive Action Parts

Feifei Chen, Nong Sang, Xiaoqin Kuang, Haitao Gan, Changxin Gao
DOI: https://doi.org/10.1364/josaa.32.000173
2015-01-01
Abstract: Recent methods based on midlevel visual concepts have shown promising capabilities in human action recognition. However, automatically discovering semantic entities such as action parts remains challenging. In this paper, we present a method for automatically discovering distinctive midlevel action parts from video for the recognition of human actions. We address this problem by learning and selecting a collection of discriminative and representative action part detectors directly from video data. We first train a large collection of candidate exemplar-linear discriminant analysis detectors from clusters obtained by clustering spatiotemporal patches in whitened space. To select the most effective detectors from this large pool of candidates, we propose novel coverage-entropy curves (CE curves) to evaluate a detector's capability of distinguishing actions. The CE curves characterize the correlation between the representative and discriminative power of detectors. In the experiments, we apply the mined part detectors as a visual vocabulary to the task of action recognition on four datasets: KTH, Olympic Sports, UCF50, and HMDB51. The experimental results demonstrate the effectiveness of the proposed method and show state-of-the-art recognition performance.
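The abstract does not define how coverage and entropy are computed, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It assumes that coverage is the fraction of videos in which a detector has fired at least once, and that entropy is the Shannon entropy of the action labels among the detector's firings, both tracked as firings are added in descending score order. The function names, the `(score, video_id, label)` tuple layout, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def class_entropy(labels):
    """Shannon entropy (in bits) of the class distribution in `labels`."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def coverage_entropy_curve(detections, num_videos):
    """Trace an assumed coverage-entropy curve for a single detector.

    `detections` is a list of (score, video_id, class_label) tuples, one per
    patch the detector responded to. Firings are added from the highest score
    down; after each one we record:
      - coverage: fraction of all videos with at least one firing so far
      - entropy: class entropy of the labels of all firings so far
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    covered_videos = set()
    fired_labels = []
    curve = []
    for _score, video_id, label in ranked:
        covered_videos.add(video_id)
        fired_labels.append(label)
        coverage = len(covered_videos) / num_videos
        curve.append((coverage, class_entropy(fired_labels)))
    return curve


# Toy example (made-up scores and labels, for illustration only).
detections = [
    (0.9, "v1", "run"),
    (0.8, "v2", "run"),
    (0.4, "v3", "walk"),
    (0.1, "v4", "jump"),
]
curve = coverage_entropy_curve(detections, num_videos=4)
for coverage, entropy in curve:
    print(f"coverage={coverage:.2f}  entropy={entropy:.3f} bits")
```

Under this reading, a detector whose entropy stays low while its coverage grows would be both representative (it fires in many videos) and discriminative (its firings concentrate on few action classes), which is the trade-off the CE curves are said to characterize.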