Deep Point-Wise Prediction for Action Temporal Proposal

Luxuan Li, Tao Kong, Fuchun Sun, Huaping Liu
DOI: https://doi.org/10.1007/978-3-030-36718-3_40
2019-01-01
Abstract: Detecting actions in videos is an important yet challenging task. Previous works usually utilize (a) sliding window paradigms, or (b) per-frame action scoring and grouping to enumerate the possible temporal locations. Their performance is also limited by the design of the sliding windows or grouping strategies. In this paper, we present a simple and effective method for temporal action proposal generation, named Deep Point-wise Prediction (DPP). DPP simultaneously predicts the probability that an action exists and the corresponding temporal locations, without using any handcrafted sliding windows or grouping. The whole system is trained end-to-end with a joint loss of temporal action proposal classification and location prediction.
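The abstract does not give the loss or the target encoding, so the following is only a minimal sketch of what point-wise prediction with a joint loss could look like, not the authors' implementation. It assumes each temporal point predicts an action score plus its distances to the start and end of the enclosing action, and it combines binary cross-entropy with a smooth L1 term on positive points. The names `point_targets`, `joint_loss`, `decode`, and `loc_weight` are hypothetical.

```python
import math


def point_targets(num_points, actions):
    """Build (label, dist_to_start, dist_to_end) for every temporal point.

    `actions` is a list of (start, end) index pairs; this encoding is an
    assumption made for illustration, not taken from the paper.
    """
    targets = []
    for t in range(num_points):
        label, d_start, d_end = 0, 0.0, 0.0
        for start, end in actions:
            if start <= t <= end:
                label, d_start, d_end = 1, float(t - start), float(end - t)
                break
        targets.append((label, d_start, d_end))
    return targets


def smooth_l1(x):
    """Smooth L1 penalty: quadratic near zero, linear elsewhere."""
    x = abs(x)
    return 0.5 * x * x if x < 1.0 else x - 0.5


def joint_loss(scores, offsets, targets, loc_weight=1.0):
    """Classification loss on all points plus location loss on positive points."""
    eps = 1e-7
    cls_loss, loc_loss, num_pos = 0.0, 0.0, 0
    for p, (d_s, d_e), (label, t_s, t_e) in zip(scores, offsets, targets):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        cls_loss += -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
        if label == 1:
            loc_loss += smooth_l1(d_s - t_s) + smooth_l1(d_e - t_e)
            num_pos += 1
    cls_loss /= len(targets)
    loc_loss /= max(num_pos, 1)
    return cls_loss + loc_weight * loc_loss


def decode(t, d_start, d_end):
    """Turn a point index and its predicted distances into a (start, end) proposal."""
    return (t - d_start, t + d_end)
```

Under these assumptions, every point inside an action directly yields one proposal through `decode`, so no sliding window or grouping step is needed to enumerate candidate segments.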