A Pursuit of Temporal Accuracy in General Activity Detection.

Yuanjun Xiong, Yue Zhao, Limin Wang, Dahua Lin, Xiaoou Tang
DOI: https://doi.org/10.48550/arxiv.1703.02716
2017-01-01
Abstract: Detecting activities in untrimmed videos is an important but challenging task. The performance of existing methods remains unsatisfactory, e.g., they often meet difficulties in locating the beginning and end of a long complex action. In this paper, we propose a generic framework that can accurately detect a wide variety of activities from untrimmed videos. Our first contribution is a novel proposal scheme that can efficiently generate candidates with accurate temporal boundaries. The other contribution is a cascaded classification pipeline that explicitly distinguishes between relevance and completeness of a candidate instance. On two challenging temporal activity detection datasets, THUMOS14 and ActivityNet, the proposed framework significantly outperforms the existing state-of-the-art methods, demonstrating superior accuracy and strong adaptivity in handling activities with various temporal structures.