Text-Guided Unknown Pseudo-Labeling for Open-World Object Detection

Xuefei Wang, Dong Xu
DOI: https://doi.org/10.3390/electronics13224528
IF: 2.9
2024-11-27
Electronics
Abstract: Open-world object detection (OWOD) focuses on training models with partially known class labels, enabling the detection of objects from known classes while concurrently identifying objects from unknown classes. Current models often perform suboptimally in generating pseudo-labels for unknown objects based on objectness scores due to inherent biases towards known classes. To address this issue, we propose a cross-modal learning model named Text-Guided Unknown Pseudo-Labeling for Open-world Object Detection (TGOOD), building on the Featurized Query R-CNN (FQR-CNN) framework. Specifically, we introduce a module called Similarity-Random-Similarity (SRS) to guide the model in detecting unknown objects during training. Additionally, we replace the one-to-one label assignment strategy in FQR-CNN with a one-to-many (OTM) label assignment strategy to provide more supervisory information during training. Moreover, we propose the ROI features Refinement Module (RRM) to enhance the discriminability of all objects. Experimental evaluations on the PASCAL VOC, MS-COCO, and COCO-O benchmarks demonstrate TGOOD's superior open-world detection capability.
Categories: Engineering, Electrical & Electronic; Computer Science, Information Systems; Physics, Applied