Exploring Context Information for Accurate and Fast Object Detection

Zhenjun Shi, Xiaoqi Li, Bin Zhang
DOI: https://doi.org/10.1007/978-3-030-31654-9_20
2019-01-01
Abstract: Current top-performing object detectors depend on deep CNN backbones, such as ResNet-101 and InceptionNet, benefiting from their powerful feature representations but suffering from high computational costs. Conversely, some detectors based on lightweight models can run at real-time speed, but their performance is inferior to that of detectors equipped with powerful backbone networks. In this paper, we propose an effective yet efficient one-stage detector. The proposed detector inherits the architecture of SSD and introduces two novel modules, the Feature Enhancement Module (FEM) and the Feature Fusion Module (FFM). The FEM strengthens features by increasing the size of the receptive field and introducing more context, while the FFM enhances the shallow part of the detector by fusing two adjacent feature maps. To evaluate their effectiveness, experiments are conducted on two major benchmarks. Experimental results demonstrate that the proposed detector performs much better than the original SSD, without losing real-time processing speed.
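The two ideas named in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract does not say how the FEM enlarges the receptive field or how the FFM combines feature maps. The sketch assumes dilated convolution for the former, and nearest-neighbour 2x upsampling followed by element-wise addition for the latter. All function names (`receptive_field`, `upsample_nearest_2x`, `fuse_adjacent`) are hypothetical.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of a stack of conv layers.

    Each layer is a (kernel_size, stride, dilation) tuple. Uses the standard
    recurrence rf += (k - 1) * dilation * jump, where jump is the product of
    the strides of all preceding layers.
    """
    rf, jump = 1, 1
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


def upsample_nearest_2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D map given as a list of rows."""
    out = []
    for row in fmap:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))  # duplicate the row, as a separate list
    return out


def fuse_adjacent(shallow, deep):
    """Fuse two adjacent feature maps (assumed scheme, not from the paper):
    upsample the deeper, coarser map to the shallow map's size and add.
    """
    up = upsample_nearest_2x(deep)
    if len(up) != len(shallow) or len(up[0]) != len(shallow[0]):
        raise ValueError("deep map must be exactly half the size of shallow map")
    return [[s + d for s, d in zip(s_row, d_row)]
            for s_row, d_row in zip(shallow, up)]


# Two plain 3x3 convs versus a 3x3 conv followed by a dilated 3x3 conv.
plain = receptive_field([(3, 1, 1), (3, 1, 1)])
dilated = receptive_field([(3, 1, 1), (3, 1, 3)])

# A 4x4 shallow map fused with a 2x2 deep map.
fused = fuse_adjacent([[1] * 4 for _ in range(4)], [[10, 20], [30, 40]])
```

Under these assumptions, replacing the second plain convolution with one of dilation 3 widens the receptive field without adding parameters, and the fused map gives every shallow location the value of the coarser, more context-rich location that covers it.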