A Stronger Baseline for Ego-Centric Action Detection

Zhiwu Qing, Ziyuan Huang, Xiang Wang, Yutong Feng, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Changxin Gao, Marcelo H. Ang Jr, Nong Sang
2021-01-01
Abstract: This technical report analyzes the egocentric video action detection method we used in the 2021 EPIC-KITCHENS-100 competition, hosted at the CVPR 2021 Workshop. The goal of the task is to locate the start and end times of each action in a long untrimmed video and to predict its action category. We adopt a sliding-window strategy to generate proposals, which adapts better to short-duration actions. In addition, we show that classification and proposal generation conflict when handled by the same network. Separating the two tasks boosts detection performance with high efficiency. By simply employing these strategies, we achieved 16.10% on the test set of the EPIC-KITCHENS-100 Action Detection challenge using a single model, surpassing the baseline method by 11.7% in terms of average mAP.
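To make the sliding-window idea concrete, the following is a minimal sketch of how candidate temporal segments could be enumerated over an untrimmed video. The abstract does not state the window lengths or strides used, so the function name `sliding_window_proposals`, the `overlap` parameter, and all numeric values here are hypothetical and serve only as illustration.

```python
def sliding_window_proposals(duration, window_lengths, overlap=0.5):
    """Enumerate (start, end) candidate segments, in seconds.

    Hypothetical helper: the window lengths and overlap ratio are
    assumptions, not values taken from the report.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    proposals = []
    for length in window_lengths:
        if length > duration:
            continue  # a window longer than the video cannot fit
        stride = length * (1.0 - overlap)
        # Number of windows of this length that fit inside the video.
        count = int((duration - length) / stride) + 1
        for i in range(count):
            start = i * stride
            proposals.append((start, start + length))
    return proposals


# Short windows are included so that brief actions are still covered.
proposals = sliding_window_proposals(10.0, window_lengths=(1.0, 2.0))
```

In a two-stage setup like the one the abstract describes, each candidate segment would then be scored by the proposal model and labeled by a separate classifier; that part is not shown here.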