Abstract: Out-of-distribution (OOD) detection depends on two things: generalized feature representations and precise category descriptions. Vision-language models such as CLIP have recently brought significant advances on both fronts, but constructing precise category descriptions is still in its infancy because the unseen categories are not available. This work introduces two hierarchical contexts, a perceptual context and a spurious context, that describe the precise category boundary through automatic prompt tuning. Specifically, perceptual contexts capture the inter-category differences (e.g., cats vs. apples) for the current classification task, while spurious contexts identify spurious OOD samples (similar to, but not members of, the category) for each individual category (e.g., cats vs. panthers, apples vs. peaches). The two contexts hierarchically construct the precise description of a category: a sample is first roughly classified into the predicted category, and then examined more finely to decide whether it is truly an in-distribution (ID) sample or actually OOD. Moreover, the precise category descriptions within the vision-language framework enable a novel application: CATegory-EXtensible OOD detection (CATEX). The set of recognizable categories can be extended efficiently by simply merging the hierarchical contexts learned under different sub-task settings. Extensive experiments demonstrate CATEX's effectiveness, robustness, and category-extensibility. For instance, CATEX consistently outperforms competing methods by a large margin under several protocols on the challenging ImageNet-1K dataset. In addition, we offer new insights on how to efficiently scale up prompt engineering in vision-language models to recognize thousands of object categories, and on how to incorporate large language models (such as GPT-3) to boost zero-shot applications. Code will be made public soon.
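
The two-stage decision and the context merging described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each category has one perceptual and one spurious embedding in a shared image-text space, that similarity is cosine similarity, and that a sample counts as ID when it is closer to the perceptual context of its predicted category than to the spurious one. The function names `hierarchical_decision` and `merge_contexts` and all toy vectors are hypothetical.

```python
import numpy as np


def l2_normalize(v, axis=-1):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return v / np.linalg.norm(v, axis=axis, keepdims=True)


def hierarchical_decision(image_feat, perceptual, spurious):
    """Assumed two-stage rule (a sketch, not the paper's exact scoring).

    Stage 1: pick the category whose perceptual context is most similar.
    Stage 2: accept as ID only if the perceptual context of that category
             beats its spurious context.
    Returns (predicted_category_index, is_in_distribution).
    """
    x = l2_normalize(np.asarray(image_feat, dtype=float))
    p = l2_normalize(np.asarray(perceptual, dtype=float))
    s = l2_normalize(np.asarray(spurious, dtype=float))
    sims = p @ x                      # similarity to every category
    k = int(np.argmax(sims))          # rough classification
    is_id = bool(sims[k] > s[k] @ x)  # fine-grained ID vs. OOD check
    return k, is_id


def merge_contexts(contexts_a, contexts_b):
    """Assumed category extension: stack contexts learned on two sub-tasks."""
    return np.concatenate([contexts_a, contexts_b], axis=0)


# Toy 4-d embeddings (hypothetical). Sub-task A: cat, apple.
perceptual_a = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
spurious_a = np.array([[0.8, 0.0, 0.0, 0.6], [0.0, 0.8, 0.0, 0.6]])
# Sub-task B: dog.
perceptual_b = np.array([[0.0, 0.0, 1.0, 0.0]])
spurious_b = np.array([[0.0, 0.0, 0.8, 0.6]])

cat_image = np.array([1.0, 0.0, 0.0, 0.1])
panther_like_image = np.array([0.7, 0.0, 0.0, 0.7])
dog_image = np.array([0.0, 0.0, 1.0, 0.1])

# Extend the recognizable categories by merging both sub-tasks.
perceptual_all = merge_contexts(perceptual_a, perceptual_b)
spurious_all = merge_contexts(spurious_a, spurious_b)

cat_result = hierarchical_decision(cat_image, perceptual_a, spurious_a)
panther_result = hierarchical_decision(panther_like_image, perceptual_a, spurious_a)
dog_result = hierarchical_decision(dog_image, perceptual_all, spurious_all)
```

In this toy setup the cat image is assigned to the cat category and accepted as ID, the panther-like image is also assigned to the cat category but rejected as OOD, and the dog image becomes recognizable only after the contexts of the second sub-task are merged in.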

Category-Extensible Out-of-Distribution Detection via Hierarchical Context Descriptions
