Few-shot Object Detection Based on Self-Supervised Feature Pyramid Network

Wen Lv, Xinwei Qi, Hongbo Shi, Shuai Tan, Bing Song, Yang Tao
DOI: https://doi.org/10.1117/1.jei.33.2.023021
IF: 0.829
2024-01-01
Journal of Electronic Imaging
Abstract: In few-shot object detection, the limited labeled data cannot adequately represent all possible scenarios and objects, so the model cannot fully learn the features and attributes of novel classes. In some cases, novel classes are confused with base classes because of feature similarity, which leads to inaccurate detections. During the two-stage fine-tuning phase, when the novel-class data differ significantly from the base-class data, the candidate boxes generated by a backbone network trained on the base classes may not suit novel-class targets. It is therefore a forward-looking research problem to explore how to mine novel-class knowledge during feature extraction, compensating for the scarcity of feature samples and improving sensitivity to novel classes. To enrich the feature representation of novel classes, we propose a self-supervised feature pyramid network. This approach explores novel-class attributes in the lower-level network, encouraging the feature extractor to generate candidate boxes that are consistent with novel-class targets and making the backbone network more sensitive to novel classes. We validate the proposed framework against state-of-the-art methods on two popular datasets, achieving an improvement of up to 5.2% on the standard PASCAL VOC benchmark and 1.4% on the challenging COCO benchmark. (c) 2023 SPIE and IS&T