Feature refinement with multi-level context for object detection

Yingdong Ma,Yanan Wang
DOI: https://doi.org/10.1007/s00138-023-01402-5
IF: 2.983
2023-01-01
Machine Vision and Applications
Abstract:Robust multi-scale object detection is challenging as it requires both spatial details and semantic knowledge to deal with problems including high scale variation and cluttered background. Appropriate fusion of high-resolution features with deep semantic features is the key issue to achieve better performance. Different approaches have been developed to extract and combine deep features with shallow layer spatial features, such as feature pyramid network. However, high-resolution feature maps contain noisy and distractive features. Directly combines shallow features with semantic features might degrade detection accuracy. Besides, contextual information is also important for multi-scale object detection. In this work, we present a feature refinement scheme to tackle the feature fusion problem. The proposed feature refinement module increases feature resolution and refine feature maps progressively with the guidance from deep features. Meanwhile, we propose a context extraction method to capture global and local contextual information. The method utilizes a multi-level cross-pooling unit to extract global context and a cascaded context module to extract local context. The proposed object detection framework has been evaluated on PASCAL VOC and MS COCO datasets. Experimental results demonstrate that the proposed method performs favorably against state-of-the-art approaches.
What problem does this paper attempt to address?