Disentangled Non-Local Neural Networks

Minghao Yin, Zhuliang Yao, Yue Cao, Xiu Li, Zheng Zhang, Stephen Lin, Han Hu
DOI: https://doi.org/10.1007/978-3-030-58555-6_12
2020-01-01
Abstract: The non-local block is a popular module for strengthening the context modeling ability of a regular convolutional neural network. This paper first studies the non-local block in depth, where we find that its attention computation can be split into two terms, a whitened pairwise term accounting for the relationship between two pixels and a unary term representing the saliency of every pixel. We also observe that the two terms trained alone tend to model different visual clues, e.g., the whitened pairwise term learns within-region relationships while the unary term learns salient boundaries. However, the two terms are tightly coupled in the non-local block, which hinders the learning of each. Based on these findings, we present the disentangled non-local block, where the two terms are decoupled to facilitate learning for both terms. We demonstrate the effectiveness of the decoupled design on various tasks, such as semantic segmentation on Cityscapes, ADE20K and PASCAL Context, object detection on COCO, and action recognition on Kinetics.
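
The split described in the abstract can be checked numerically. The abstract gives no formulas, so the sketch below assumes the common embedded-Gaussian form of non-local attention, softmax over j of q_i . k_j, and assumes that "whitened" means subtracting the mean query and mean key. All function and variable names are hypothetical and are not taken from the paper's code. Under those assumptions, the pairwise score (q_i - mu_q) . (k_j - mu_k) plus the unary score mu_q . k_j differs from q_i . k_j only by a value that is constant over j, so the softmax output is unchanged.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def nonlocal_attention(queries, keys):
    """Assumed standard form: row i is softmax_j(q_i . k_j)."""
    return [softmax([dot(q, k) for k in keys]) for q in queries]


def split_attention(queries, keys):
    """Same attention, written as whitened pairwise term plus unary term."""
    mu_q = mean_vector(queries)
    mu_k = mean_vector(keys)
    # Unary term: depends only on the key position j (per-pixel saliency).
    unary = [dot(mu_q, k) for k in keys]
    rows = []
    for q in queries:
        # Whitened pairwise term: both embeddings have their mean removed.
        pairwise = [dot(sub(q, mu_q), sub(k, mu_k)) for k in keys]
        rows.append(softmax([p + u for p, u in zip(pairwise, unary)]))
    return rows


# Random stand-ins for query/key embeddings of 6 pixels with 4 channels.
rng = random.Random(0)
queries = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(6)]
keys = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(6)]

standard = nonlocal_attention(queries, keys)
split = split_attention(queries, keys)
max_gap = max(
    abs(a - b) for ra, rb in zip(standard, split) for a, b in zip(ra, rb)
)
print(max_gap < 1e-9)
```

In this form the two terms are added inside a single softmax, which is one way to read the coupling the abstract refers to. The sketch only verifies the split; it does not implement the disentangled block itself.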