Unified Language-driven Zero-shot Domain Adaptation

Senqiao Yang, Zhuotao Tian, Li Jiang, Jiaya Jia

2024-04-11

Abstract: This paper introduces Unified Language-driven Zero-shot Domain Adaptation (ULDA), a novel task setting that enables a single model to adapt to diverse target domains without explicit domain-ID knowledge. We identify the constraints in the existing language-driven zero-shot domain adaptation task, particularly the requirement for domain IDs and domain-specific models, which may restrict flexibility and scalability. To overcome these issues, we propose a new framework for ULDA, consisting of Hierarchical Context Alignment (HCA), Domain Consistent Representation Learning (DCRL), and Text-Driven Rectifier (TDR). These components work synergistically to align simulated features with target text across multiple visual levels, retain semantic correlations between different regional representations, and rectify biases between simulated and real target visual features, respectively. Our extensive empirical evaluations demonstrate that this framework achieves competitive performance in both settings, surpassing even the model that requires domain-ID, showcasing its superiority and generalization ability. The proposed method is not only effective but also maintains practicality and efficiency, as it does not introduce additional computational costs during inference. Our project page is

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper addresses key limitations of language-driven zero-shot domain adaptation, in particular the limited flexibility and scalability of models that must adapt to a target domain without any target-domain data. Specifically:

1. **Limitations of existing methods**:
   - Existing language-driven zero-shot domain adaptation methods (such as PØDA) need an explicit domain ID at test time to select the corresponding model, which limits flexibility in practical applications.
   - They train a separate, domain-specific model for each target domain, which limits scalability as the number of target domains grows.

2. **Proposed new task setting**:
   - The paper introduces a new task setting, Unified Language-driven Zero-shot Domain Adaptation (ULDA), in which a single model adapts to multiple target domains at test time without explicit domain IDs.
   - ULDA trains on source-domain data and textual descriptions of the target domains only, so it needs no access to target-domain images, which may be unavailable because of privacy constraints or data scarcity.

3. **Methods to address the new challenges**:
   - The authors propose a framework with three main components: Hierarchical Context Alignment (HCA), Domain Consistent Representation Learning (DCRL), and Text-Driven Rectifier (TDR).
   - HCA aligns simulated features with the target text at multiple visual levels, DCRL preserves semantic correlations between different regional representations, and TDR rectifies the bias between simulated features and real target visual features.

With these components, the paper reports competitive performance both in the traditional setting and in the new ULDA setting, surpassing even the model that requires domain IDs, and it adds no computational cost at inference.
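The idea of aligning features with a target text embedding at several visual levels can be illustrated with a small sketch. This is not the paper's HCA loss: the three levels (pixel, region, global), the cosine-distance objective, the equal weighting, and the names `alignment_loss` and `region_size` are all assumptions made for illustration, and the inputs are plain Python lists rather than real image or text features.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def alignment_loss(pixel_features, text_embedding, region_size=2):
    """Hypothetical multi-level alignment loss (illustrative only).

    pixel_features: list of per-pixel feature vectors (a flattened feature map).
    text_embedding: embedding of the target-domain description, same dimension.
    region_size: number of consecutive pixels pooled into one region (assumption).
    """
    # Pixel level: every pixel feature is compared with the text embedding.
    pixel = sum(1.0 - cosine(f, text_embedding) for f in pixel_features)
    pixel /= len(pixel_features)

    # Region level: pool consecutive pixels into regions, then compare.
    regions = [
        mean_vector(pixel_features[i:i + region_size])
        for i in range(0, len(pixel_features), region_size)
    ]
    region = sum(1.0 - cosine(r, text_embedding) for r in regions) / len(regions)

    # Global level: pool the whole map into one vector, then compare.
    global_ = 1.0 - cosine(mean_vector(pixel_features), text_embedding)

    # Equal weighting of the three levels is an arbitrary choice for this sketch.
    total = (pixel + region + global_) / 3.0
    return {"pixel": pixel, "region": region, "global": global_, "total": total}
```

Under these assumptions, features that point in the same direction as the text embedding give a loss of 0 at every level, and features that point the opposite way give the maximum cosine distance of 2. A real implementation would operate on feature maps from a vision backbone and on text embeddings from a vision-language model.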

Attention-based Cross-Layer Domain Alignment for Unsupervised Domain Adaptation

A New Bidirectional Unsupervised Domain Adaptation Segmentation Framework

Fine-grained Representation Alignment for Zero-shot Domain Adaptation

Combining inherent knowledge of vision-language models with unsupervised domain adaptation through strong-weak guidance

Unsupervised Domain Adaptation Approach for Vision-Based Semantic Understanding of Bridge Inspection Scenes Without Manual Annotations

Unsupervised Domain Adaptation with Unified Joint Distribution Alignment

Domain consensual contrastive learning for few-shot universal domain adaptation

CLIP the Divergence: Language-guided Unsupervised Domain Adaptation

Zero-Shot Deep Domain Adaptation

Domain-Aware Continual Zero-Shot Learning

Multi-Task Domain Adaptation for Language Grounding with 3D Objects

Adapt in Contexts: Retrieval-Augmented Domain Adaptation via In-Context Learning

Domain Adaptation for Underwater Image Enhancement

Unsupervised Domain Adaptation for Video Object Grounding with Cascaded Debiasing Learning

HyUniDA: Breaking Label Set Constraints for Universal Domain Adaptation in Cross-Scene Hyperspectral Image Classification

Zero-shot domain adaptation based on dual-level mix and contrast

Domain Adaptation In Reinforcement Learning Via Latent Unified State Representation

Unsupervised Domain Adaptation with Joint Domain-Adversarial Reconstruction Networks

Dual similarity pre-training and domain difference encouragement learning for vehicle re-identification in the wild.

Universal Semi-Supervised Domain Adaptation by Mitigating Common-Class Bias