Hierarchical Context Features Embedding for Object Detection

Heqian Qiu, Hongliang Li, Qingbo Wu, Fanman Meng, Linfeng Xu, King Ngi Ngan, Hengcan Shi
DOI: https://doi.org/10.1109/tmm.2020.2971175
IF: 7.3
2020-01-01
IEEE Transactions on Multimedia
Abstract: Pixel-level segmentation has been widely used to improve object detection. Most existing methods refine detection features either by adding a segmentation branch as a constraint or by simply embedding high-level segmentation features into detection features within the local receptive field. However, noisy segmentation features are unavoidable in real-world applications and can easily cause false positives. To address this problem, we propose a novel hierarchical context embedding module that effectively embeds segmentation features into detection features. The module captures hierarchical context information, comprising local objects or parts as well as nonlocal context features, by learning multiple attention maps, and then uses the interdependencies between features to recalibrate noisy segmentation features. Furthermore, we use this module in the proposed gated encoder-decoder network, which adaptively aggregates feature maps of different resolutions through a gate mechanism, so that multiscale segmentation feature maps can be embedded into detection features for more accurate detection of objects of all sizes. Experimental results demonstrate the effectiveness of the proposed method on the Pascal VOC 2012 Seg dataset, the Pascal VOC dataset and the MS COCO dataset.
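The gate-based aggregation mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's architecture: the abstract does not specify the form of the gate, so the code below assumes a simple per-position sigmoid gate that blends two feature maps already brought to the same resolution. The function name `gated_fuse` and the parameters `w_fine`, `w_coarse` and `bias` are hypothetical, and plain Python lists stand in for tensors to keep the example dependency-free.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fuse(fine, coarse, w_fine=1.0, w_coarse=1.0, bias=0.0):
    """Blend two same-sized 2D feature maps with a per-position gate.

    Assumed form (not taken from the paper):
        g   = sigmoid(w_fine * f + w_coarse * c + bias)
        out = g * f + (1 - g) * c
    so each output value is a convex combination of the two inputs.
    """
    if len(fine) != len(coarse) or any(
        len(r) != len(s) for r, s in zip(fine, coarse)
    ):
        raise ValueError("feature maps must have the same shape")

    fused = []
    for row_f, row_c in zip(fine, coarse):
        out_row = []
        for f, c in zip(row_f, row_c):
            g = sigmoid(w_fine * f + w_coarse * c + bias)  # gate in (0, 1)
            out_row.append(g * f + (1.0 - g) * c)
        fused.append(out_row)
    return fused


# Toy 2x2 maps standing in for a high-resolution and an upsampled
# low-resolution feature map (values are made up for illustration).
fine = [[1.0, 0.0], [0.5, 2.0]]
coarse = [[0.0, 1.0], [0.5, -2.0]]
fused = gated_fuse(fine, coarse)
```

In a trained network the gate weights would be learned, typically as convolutions over the concatenated feature maps, rather than fixed scalars as assumed here.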