Abstract: Unsupervised domain adaptation remains a critical challenge in enabling the knowledge transfer of models across unseen domains. Existing methods struggle to balance the need for domain-invariant representations with preserving domain-specific features, often because their alignment approaches force samples with similar semantics to be projected close together in the latent space despite drastic domain differences. We introduce LAGUNA (LAnguage Guided UNsupervised Adaptation with structured spaces), a novel approach that shifts the focus from aligning representations in absolute coordinates to aligning the relative positioning of equivalent concepts in latent spaces. LAGUNA defines a domain-agnostic structure upon the semantic/geometric relationships between class labels in language space and guides adaptation, ensuring that the organization of samples in visual space reflects reference inter-class relationships while preserving domain-specific characteristics. Remarkably, LAGUNA surpasses previous works in 18 different adaptation scenarios across four diverse image and video datasets with average accuracy improvements of +3.32% on DomainNet, +5.75% on GeoPlaces, +4.77% on GeoImnet, and +1.94% mean class accuracy improvement on EgoExo4D.

What problem does this paper attempt to address?

This paper addresses a key challenge in **Unsupervised Domain Adaptation (UDA)**: existing methods struggle to balance learning **domain-invariant representations** with **preserving domain-specific features**. This is usually because existing alignment methods force samples with similar semantics to be projected close together in the latent space, even when they differ significantly across domains.

#### Main problems:

1. **Differences in domain distributions**: The core problem in UDA is training a model that generalizes when data distributions differ. When the source and target domains have different data distributions, the model's performance on the target domain declines.
2. **Limitations of absolute coordinate alignment**: Traditional UDA methods attempt to align the representation space by minimizing the distribution differences between the source domain and the target domain. However, this can produce an overly general representation space that loses domain-specific detail.
3. **Importance of relative positions**: Existing methods focus on aligning absolute coordinates and ignore the alignment of the relative positions of equivalent concepts in the latent space.

#### LAGUNA's solutions:

LAGUNA (LAnguage Guided UNsupervised Adaptation with structured spaces) shifts the focus from aligning absolute coordinates to aligning **the relative positions of equivalent concepts in the latent space**. Specifically, LAGUNA works as follows:

- **Define domain-independent structures**: Define a domain-independent structure based on the semantic/geometric relationships of class labels in the language space.
- **Guided adaptation**: Ensure that the organization of samples in the visual space reflects the reference inter-class relationships while preserving domain-specific features.
- **Three-stage method**:
  - **Stage 1**: Construct a language-guided reference structure.
  - **Stage 2**: Train a language model to provide pseudo-labels for the unlabeled target domain and ensure that the internal text embeddings are aligned with the domain-independent structure defined in Stage 1.
  - **Stage 3**: Train a cross-domain visual classifier, learn domain-specific action anchors, and align its structure with the domain-independent structure defined in Stage 1.

With this method, LAGUNA outperforms existing methods in 18 adaptation scenarios across four datasets: it improves average accuracy by +3.32% on DomainNet, +5.75% on GeoPlaces, and +4.77% on GeoImnet, and improves mean class accuracy by +1.94% on EgoExo4D.

### Summary

The main contributions of LAGUNA are:

1. A study of the advantages of using relative representations for UDA.
2. LAGUNA, a three-stage method that learns cross-domain classifiers, keeping the source and target spaces independent while aligning both to a language-derived reference structure.
3. A demonstration of the method's superiority through extensive ablation experiments and comparison with existing state-of-the-art models.

LAGUNA addresses domain distribution differences in UDA by aligning relative positions while preserving domain-specific features.
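The difference between aligning absolute coordinates and aligning relative positions can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the use of pairwise cosine similarity as the "structure", the mean squared gap, and the random stand-in embeddings are all assumptions made for illustration. It shows that a set of class anchors can sit at very different coordinates from a reference (here, a rotated and rescaled copy) while having exactly the same inter-class structure.

```python
import numpy as np


def relative_structure(anchors: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarities between class anchors (one row per class)."""
    unit = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return unit @ unit.T


def structure_gap(reference: np.ndarray, anchors: np.ndarray) -> float:
    """Mean squared difference between a reference structure and the anchors' structure.

    Hypothetical stand-in for a structure-alignment objective.
    """
    return float(np.mean((reference - relative_structure(anchors)) ** 2))


rng = np.random.default_rng(0)

# Hypothetical class-label embeddings in "language space": 5 classes, 16 dimensions.
text_anchors = rng.normal(size=(5, 16))
reference = relative_structure(text_anchors)

# Domain A: a rotated and rescaled copy of the text anchors. Its absolute
# coordinates differ from the reference, but its relative structure is identical.
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))  # random orthogonal matrix
domain_a_anchors = 3.0 * text_anchors @ rotation

# Domain B: unrelated random anchors, so the relative structure does not match.
domain_b_anchors = rng.normal(size=(5, 16))

abs_distance_a = float(np.mean((text_anchors - domain_a_anchors) ** 2))
gap_a = structure_gap(reference, domain_a_anchors)
gap_b = structure_gap(reference, domain_b_anchors)

print(f"absolute coordinate distance (domain A): {abs_distance_a:.3f}")
print(f"structure gap (domain A): {gap_a:.2e}")
print(f"structure gap (domain B): {gap_b:.2e}")
```

In this sketch, domain A has a near-zero structure gap despite its large coordinate distance, so a structure-based objective would leave its domain-specific placement untouched, whereas an absolute-coordinate objective would penalize it.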

LAGUNA: LAnguage Guided UNsupervised Adaptation with structured spaces
