Abstract: Open World Object Detection (OWOD) aims to adapt object detection to an open-world environment, so as to detect unknown objects and learn knowledge incrementally. Existing OWOD methods typically rely on training sets with a relatively small number of known objects. Lacking generic object knowledge, they fail to comprehensively perceive objects beyond the scope of their training sets. Recent advances in large vision models (LVMs), trained on large-scale data, offer a promising opportunity to harness rich generic knowledge for the fundamental advancement of OWOD. Motivated by the Segment Anything Model (SAM), a prominent LVM known for its exceptional ability to segment generic objects, we first demonstrate the feasibility of employing SAM for OWOD and establish the first SAM-Guided OWOD baseline. We then identify and address two fundamental challenges in SAM-Guided OWOD and propose a SAM-Guided Robust Open-world Detector (SGROD), which significantly improves the recall of unknown objects without losing precision on known objects. The two challenges are: (1) noisy labels caused by the class-agnostic nature of SAM; and (2) precision degradation on known objects as more unknown objects are recalled. For the first, we propose a dynamic label assignment (DLA) method that adaptively selects confident labels from SAM during training, reducing the impact of label noise. For the second, we introduce cross-layer learning (CLL) and SAM-based negative sampling (SNS), which enable SGROD to avoid precision loss by learning robust decision boundaries for objectness and classification. Experiments on public datasets show that SGROD not only improves the recall of unknown objects by a large margin (~20%), but also preserves highly competitive precision on known objects. The code is available at https://github.com/harrylin-hyl/SGROD.

Sniffing Threatening Open-World Objects in Autonomous Driving by Open-Vocabulary Models

OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection

OW-Adapter: Human-Assisted Open-World Object Detection with a Few Examples

A Fusion Method Aiming at Environmental Perception of Autonomous Vehicle Based on Visual Scheme

UNCOVER: Unknown Class Object Detection for Autonomous Vehicles in Real-time

From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects

USD: Unknown Sensitive Detector Empowered by Decoupled Objectness and Segment Anything Model

Recalling Unknowns without Losing Precision: An Effective Solution to Large Model-Guided Open World Object Detection

Open World Object Detection: A Survey

UADet: A Remarkably Simple Yet Effective Uncertainty-Aware Open-Set Object Detection Framework

On the Potential of Open-Vocabulary Models for Object Detection in Unusual Street Scenes

Instance-Dictionary Learning for Open-World Object Detection in Autonomous Driving Scenarios

Detecting the open-world objects with the help of the Brain

Unsupervised Recognition of Unknown Objects for Open-World Object Detection

Addressing the Challenges of Open-World Object Detection

UC-OWOD: Unknown-Classified Open World Object Detection

OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision

Exploring Orthogonality in Open World Object Detection

Object Detectors in the Open Environment: Challenges, Solutions, and Outlook

Opening up Open-World Tracking

SKDF: A Simple Knowledge Distillation Framework for Distilling Open-Vocabulary Knowledge to Open-world Object Detector