Local Spatio-Temporal Feature Based Voting Framework for Complex Human Activity Detection and Localization

Xinye Zhang, Jinshi Cui, Lu Tian, Hongbin Zha
DOI: https://doi.org/10.1109/acpr.2011.6166678
2011-01-01
Abstract: Complex human activity detection is a challenging problem, especially when people interact with each other. Approaches that use local spatio-temporal features work well under background clutter and changes in scale and illumination. However, most of them focus on classifying short video sequences. In real-world applications such as surveillance, it is hard to obtain well-segmented video clips to classify, so detecting and localizing complex human activities in unsegmented videos is a problem that still needs to be solved. In this paper, we propose a variation of the Hough voting method, based on local spatio-temporal features and the Implicit Shape Model, that localizes and recognizes complex human activities simultaneously. Our approach is tested on the UT-Interaction dataset and demonstrates promising results in complex human activity detection and localization.
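As a rough illustration of how Hough voting with an Implicit Shape Model can localize and label an activity in one pass, the following minimal Python sketch shows the generic idea only; it is not the variation proposed in the paper. The function name `hough_vote`, the codebook layout, the `bin_size` parameter, and the toy data are all hypothetical choices made for this example. Each local feature is assumed to be already matched to a codeword, and each codeword stores offsets to the activity center in space-time together with a class label and a vote weight.

```python
from collections import defaultdict


def hough_vote(features, codebook, bin_size=10):
    """Generic ISM-style Hough voting (illustrative assumption, not the paper's method).

    features: list of (codeword_id, x, y, t) for detected local features.
    codebook: dict mapping codeword_id to a list of
              (label, dx, dy, dt, weight) entries learned from training data.
    Returns (label, center, score) for the strongest accumulator cell,
    or None when no votes were cast.
    """
    accumulator = defaultdict(float)
    for word, x, y, t in features:
        # Every codebook entry casts a weighted vote for an activity center.
        for label, dx, dy, dt, weight in codebook.get(word, []):
            center = (x + dx, y + dy, t + dt)
            # Quantize the voted center into a coarse space-time cell.
            cell = tuple(int(c // bin_size) for c in center)
            accumulator[(label, cell)] += weight
    if not accumulator:
        return None
    # The peak cell gives the activity label and its location together.
    (label, cell), score = max(accumulator.items(), key=lambda kv: kv[1])
    center = tuple((c + 0.5) * bin_size for c in cell)
    return label, center, score


# Toy data: two features that agree on a "handshake" center.
codebook = {
    0: [("handshake", 5, 0, 2, 1.0)],
    1: [("handshake", -5, 0, 2, 1.0), ("push", 20, 0, 0, 0.5)],
}
features = [(0, 10, 20, 30), (1, 20, 20, 30)]
print(hough_vote(features, codebook))
```

Because votes are accumulated per class label and per space-time cell, a single peak search yields both the recognized activity and its approximate location, which is what makes this family of methods suitable for unsegmented video.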