Attention-Guided Second-Order Pooling Convolutional Networks.

Shannan Chen, Qiule Sun, Cunhua Li, Jianxin Zhang, Qiang Zhang
DOI: https://doi.org/10.1109/ICASSP39728.2021.9414964
2021-01-01
ICASSP
Abstract: Recently, channel attention-guided convolutional networks (ConvNets) have made great advances on visual recognition tasks. However, they mainly exploit coarse first-order statistics to characterize the holistic image and rarely focus on long-range feature dependencies, which limits their representation power to a certain extent. To address these limitations, this paper proposes a novel attention-guided second-order pooling convolutional network (ASP-Net). ASP-Net introduces bilinear pooling, which captures pairwise feature interactions, to model second-order statistics. Meanwhile, it explicitly collects long-range dependencies via non-local operations, thus providing a global view in lower layers. The second-order statistics and non-local context features are then fused to obtain an enhanced representation, which is used to predict a channel-wise attention map and scale the convolutional features. Experimental results on three commonly used datasets demonstrate that ASP-Net outperforms its counterparts and achieves competitive performance. The source code is available at https://github.com/ShannanChen/ASPNet.
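The pipeline the abstract describes (second-order pooling, non-local context, fusion, channel-wise scaling) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; see the repository above for that. The function names, the row-mean summary of the covariance matrix, the concatenation-based fusion, and the single linear layer `w_fuse` followed by a sigmoid are all assumptions made here for illustration only.

```python
import numpy as np


def second_order_pool(x):
    """Channel covariance of a (C, H, W) feature map, shape (C, C)."""
    c = x.shape[0]
    feats = x.reshape(c, -1)                           # (C, N), N = H * W
    feats = feats - feats.mean(axis=1, keepdims=True)  # center each channel
    return feats @ feats.T / feats.shape[1]            # pairwise channel interactions


def non_local_context(x):
    """Simplified non-local operation followed by global averaging, shape (C,)."""
    c = x.shape[0]
    feats = x.reshape(c, -1)                           # (C, N)
    logits = feats.T @ feats / np.sqrt(c)              # (N, N) position affinities
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)      # softmax over all positions
    out = feats @ weights.T                            # each position aggregates all others
    return out.mean(axis=1)


def channel_attention(x, w_fuse):
    """Fuse both descriptors and rescale channels (hypothetical fusion scheme)."""
    second = second_order_pool(x).mean(axis=1)         # assumed row-wise summary, (C,)
    context = non_local_context(x)                     # (C,)
    fused = np.concatenate([second, context])          # (2C,)
    gate = 1.0 / (1.0 + np.exp(-(w_fuse @ fused)))     # sigmoid gate per channel, (C,)
    return x * gate[:, None, None], gate


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))                     # toy feature map: C=8, H=W=4
w_fuse = rng.standard_normal((8, 16)) * 0.1            # assumed (C, 2C) fusion weights
y, gate = channel_attention(x, w_fuse)
```

In this sketch the output `y` has the same shape as `x`, and each channel is multiplied by a gate value strictly between 0 and 1. A real network would learn the fusion weights and would likely use a richer mapping from the covariance matrix to the attention vector.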