Enhancing representation learning by exploiting effective receptive fields for object detection

Qijin Wang, Shengyu Zhang, Yu Qian, Guangcai Zhang, Hongqiang Wang
DOI: https://doi.org/10.1016/j.neucom.2022.01.020
IF: 6
2022-04-01
Neurocomputing
Abstract: Most state-of-the-art object detectors depend on multiple anchors/reference boxes in representation learning. However, such anchor-based representation does not completely match the visual information perceived by the sliding windows, thus degrading the overall performance of object detection. In this paper, we present an effective receptive field (eRF)-dependent region proposal network (eRPN) for proposal generation, which enhances the anchor-based representation via eRFs. Specifically, we define an eRF for each sliding window on the feature map and only encode objects within the eRF for unbiased representation learning. The size of the eRF depends on its backbone network. An eRF-based matching rule is devised and combined with the commonly used IoU rule for pertinent sample selection. We also design an eRF filter module, which can be appended to RPN to eliminate redundant low-quality region proposals at inference time. eRPN enhances representation learning from two perspectives, input information and sample balance, making region proposal generation more robust. We evaluate eRPN by combining it with two commonly used detection heads: Faster RCNN and Faster RCNN with FPN (Faster-FPN). Experimental results on the PASCAL VOC and MS COCO benchmarks demonstrate the effectiveness of the proposed method in learning representation for object detection.
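The abstract does not spell out the eRF-based matching rule, so the following is only a minimal sketch of how such a rule might be combined with the usual IoU rule. It assumes, purely for illustration, that the eRF is a square of side `erf_size` centered on the sliding window, and that "within the eRF" means the ground-truth box lies fully inside that square. The names `erf_box`, `inside`, `is_positive`, and the default threshold of 0.7 are hypothetical and not taken from the paper.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def erf_box(cx, cy, erf_size):
    """Assumed eRF shape: a square of side erf_size centered at (cx, cy)."""
    half = erf_size / 2.0
    return (cx - half, cy - half, cx + half, cy + half)


def inside(inner, outer):
    """True if box `inner` lies entirely within box `outer`."""
    return (inner[0] >= outer[0] and inner[1] >= outer[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def is_positive(anchor, gt, cx, cy, erf_size, iou_thresh=0.7):
    """Hypothetical combined rule: the anchor must pass the IoU test AND the
    ground-truth object must be visible inside the window's assumed eRF."""
    return iou(anchor, gt) >= iou_thresh and inside(gt, erf_box(cx, cy, erf_size))


anchor = (0, 0, 100, 100)
gt = (0, 0, 100, 100)
# Same anchor and object, but two different assumed eRF sizes.
large_erf = is_positive(anchor, gt, cx=50, cy=50, erf_size=120)
small_erf = is_positive(anchor, gt, cx=50, cy=50, erf_size=80)
```

Under these assumptions, an anchor that overlaps an object perfectly is still rejected when the object extends beyond the eRF, which is one way to read the abstract's claim that anchors should only encode what the sliding window can actually perceive.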
computer science, artificial intelligence