A novel fast combine-and-conquer object detector based on only one-level feature map
Jianhua Yang, Ke Wang, Ruifeng Li, Zhonghao Qin, Petra Perner
DOI: https://doi.org/10.1016/j.cviu.2022.103561
IF: 4.886
2022-11-01
Computer Vision and Image Understanding
Abstract: In this paper, we present a conceptually simple, flexible, and efficient "Combine-and-Conquer" detection framework. The framework consists of a very simple one-level detection pipeline, which we modularize into four parts: Backbone, Neck, Feature Aggregation, and Detection Head. To verify the performance of this framework, we design a simple yet strong detector, CC-Det. CC-Det first deploys a backbone network to encode the input image, then uses a neck network to extract rich features and enlarge the receptive field. Next, a feature aggregation network aggregates multi-scale features into a single feature map. Finally, a single detection head is deployed on this one-level feature map to output a heatmap and bounding boxes. Compared with existing multi-level detectors such as RetinaNet and FCOS, CC-Det achieves excellent performance with far fewer model parameters and much lower FLOPs. Compared with other one-level detectors, CC-Det also achieves a better trade-off among speed, accuracy, and model size without any elaborate special design. Moreover, CC-Det generalizes easily to other tasks with minor modifications and achieves state-of-the-art performance. Excellent results are presented on the COCO, PASCAL VOC, WiderFace, and CrowdHuman datasets.
computer science, artificial intelligence; engineering, electrical & electronic
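
The abstract's central idea, fusing multi-scale features into one feature map and running a single head on it, can be illustrated with a small pure-Python sketch. This is not the CC-Det implementation: the function names (`upsample_nearest`, `aggregate`, `single_head`), the use of nearest-neighbour upsampling with element-wise summation, and the 3x3 peak picking are all assumptions chosen for illustration, and bounding-box regression is omitted.

```python
# Illustrative sketch only. Stage names follow the abstract; every operation
# here (nearest-neighbour upsampling, summation, 3x3 peak picking) is an
# assumption, not the method used by CC-Det.

def upsample_nearest(fmap, factor):
    """Repeat each cell `factor` times along both axes."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def aggregate(maps):
    """Fuse multi-scale maps into one map at the resolution of maps[0].

    Assumes maps[0] is the finest map and that every other map's size
    divides it evenly.
    """
    target_h, target_w = len(maps[0]), len(maps[0][0])
    fused = [[0.0] * target_w for _ in range(target_h)]
    for fmap in maps:
        up = upsample_nearest(fmap, target_h // len(fmap))
        for y, row in enumerate(up):
            for x, v in enumerate(row):
                fused[y][x] += v
    return fused

def single_head(fused, threshold):
    """Return (y, x, score) for cells that reach `threshold` and equal the
    maximum of their 3x3 neighbourhood."""
    h, w = len(fused), len(fused[0])
    peaks = []
    for y in range(h):
        for x in range(w):
            neighbours = [
                fused[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            if fused[y][x] >= threshold and fused[y][x] == max(neighbours):
                peaks.append((y, x, fused[y][x]))
    return peaks

# Toy inputs: a 4x4 fine map and a 2x2 coarse map (hypothetical values).
fine = [[0.0] * 4 for _ in range(4)]
fine[1][2] = 1.0
coarse = [[0.0, 0.5], [0.0, 0.0]]

fused = aggregate([fine, coarse])
peaks = single_head(fused, threshold=1.0)
print(peaks)
```

Because all scales are merged before detection, only one head runs over one grid, whereas a multi-level detector applies a head to each pyramid level separately.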