Abstract: Open-vocabulary detection aims to detect objects from novel categories beyond the base categories on which the detector is trained. However, existing open-vocabulary detectors trained on base category data tend to assign higher confidence to trained categories and confuse novel categories with the background. To resolve this, we propose OV-DQUO, an Open-Vocabulary DETR with Denoising text Query training and open-world Unknown Objects supervision. Specifically, we introduce a wildcard matching method that enables the detector to learn from pairs of unknown objects recognized by an open-world detector and text embeddings with general semantics, mitigating the confidence bias between base and novel categories. Additionally, we propose a denoising text query training strategy that synthesizes foreground and background query-box pairs from open-world unknown objects and trains the detector through contrastive learning, enhancing its ability to distinguish novel objects from the background. We conducted extensive experiments on the challenging OV-COCO and OV-LVIS benchmarks, achieving new state-of-the-art results of 45.6 AP50 and 39.3 mAP on novel categories, respectively, without the need for additional training data. Models and code are released at https://github.com/xiaomoguhz/OV-DQUO


OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision
