Global Context Aggregation Network for Temporal Action Proposal Generation

Jinqiang Peng, Yuanyuan Gao, Feng Zhang, Wei Fang
DOI: https://doi.org/10.1109/iseeie55684.2022.00023
2022-01-01
Abstract: Temporal action proposal generation is an important yet challenging task that aims to locate action boundaries in long untrimmed videos. Existing methods usually generate precise action boundaries from local context, but cannot produce reliable confidence scores for retrieving proposals. To address this difficulty, we aggregate global contextual information to extract rich features when evaluating proposal confidence scores. In this paper, we propose the Global Context Aggregation Network (GCA-Net) to generate temporal action proposals. The main advantage of GCA-Net is that global context contains rich semantic and temporal information, which improves the accuracy of proposal confidence scores. Specifically, GCA-Net uses a U-shaped architecture to extract uniformly fused global semantic features and an RNN to extract bidirectional global temporal features. A proposal evaluation block then merges these global contextual features with local regular features to evaluate the confidence scores of densely distributed proposals. Extensive experiments are conducted on two challenging datasets, THUMOS-14 and ActivityNet-1.3, where GCA-Net generates proposals with high precision and recall. When combined with an existing action classifier, GCA-Net achieves remarkable temporal action detection performance compared with other methods.
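The abstract refers to densely distributed proposals and to proposal recall without spelling either out. The following is a minimal sketch of how these notions are commonly formalized in temporal action proposal work; it is not taken from the paper and does not reproduce GCA-Net. The function names (`dense_proposals`, `temporal_iou`, `recall_at_iou`), the snippet-index representation, and the `max_duration` parameter are all illustrative assumptions.

```python
from typing import List, Tuple

# A segment is a (start, end) pair on the time axis, with start < end.
Segment = Tuple[float, float]


def dense_proposals(num_snippets: int, max_duration: int) -> List[Tuple[int, int]]:
    """Enumerate every (start, end) snippet-index pair whose length is
    between 1 and max_duration and which fits inside the video.
    Assumption: this is one common way to build a dense candidate set."""
    return [
        (start, start + duration)
        for start in range(num_snippets)
        for duration in range(1, max_duration + 1)
        if start + duration <= num_snippets
    ]


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two temporal segments."""
    intersection = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


def recall_at_iou(
    proposals: List[Segment], ground_truth: List[Segment], threshold: float
) -> float:
    """Fraction of ground-truth segments matched by at least one proposal
    whose temporal IoU reaches the threshold."""
    if not ground_truth:
        return 0.0
    matched = sum(
        1
        for gt in ground_truth
        if any(temporal_iou(p, gt) >= threshold for p in proposals)
    )
    return matched / len(ground_truth)
```

In a setup like this, a network's confidence score would be used to rank the dense candidates before recall is measured on the top-ranked ones; how GCA-Net computes that score is described only at the level of the abstract above.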