Abstract: Unsupervised Domain Adaptation (UDA) aims to learn a classifier for an unlabeled target domain by leveraging knowledge from a labeled source domain with a different but related distribution. Many existing approaches learn a domain-invariant representation space by directly matching the marginal distributions of the two domains. However, they neither explore the underlying discriminative features of the target data nor align discriminative features across domains, which may lead to suboptimal performance. To tackle these two issues simultaneously, this paper presents a Joint Clustering and Discriminative Feature Alignment (JCDFA) approach for UDA, which unifies the mining of discriminative features and the alignment of class-discriminative features in a single framework. Specifically, to mine the intrinsic discriminative information of the unlabeled target data, JCDFA jointly learns a shared encoding representation for two tasks: supervised classification of labeled source data and discriminative clustering of unlabeled target data, where classification on the source domain guides clustering on the target domain toward the object categories. We then perform cross-domain discriminative feature alignment by separately optimizing two new metrics: 1) an extension of supervised contrastive learning, i.e., semi-supervised contrastive learning, and 2) an extension of Maximum Mean Discrepancy (MMD), i.e., conditional MMD, which explicitly minimize intra-class dispersion and maximize inter-class separation. When these two procedures, i.e., discriminative feature mining and alignment, are integrated into one framework, they tend to benefit from each other and improve the final performance from a cooperative learning perspective.
Experiments are conducted on four real-world benchmarks (Office-31, ImageCLEF-DA, Office-Home, and VisDA-C). The results demonstrate that JCDFA outperforms state-of-the-art domain adaptation methods by remarkable margins. Comprehensive ablation studies also verify the importance of each key component of the proposed algorithm and the effectiveness of combining the two learning strategies in one framework.
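
To make the class-conditional alignment idea in the abstract more concrete, the following is a minimal sketch of a per-class MMD computed between labeled source features and pseudo-labeled target features. It is not the authors' implementation: the Gaussian kernel, the biased estimator, the averaging over classes, and all function and parameter names (`gaussian_kernel`, `mmd2`, `conditional_mmd2`, `bandwidth`) are assumptions chosen for illustration, and the target pseudo-labels are assumed to come from some clustering step.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """RBF kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clamp tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def mmd2(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between samples x and y."""
    return (
        gaussian_kernel(x, x, bandwidth).mean()
        + gaussian_kernel(y, y, bandwidth).mean()
        - 2.0 * gaussian_kernel(x, y, bandwidth).mean()
    )


def conditional_mmd2(src_feats, src_labels, tgt_feats, tgt_pseudo_labels,
                     num_classes, bandwidth=1.0):
    """Average per-class squared MMD (illustrative, assumed formulation).

    Classes that are absent from either domain are skipped.
    """
    per_class = []
    for c in range(num_classes):
        xs = src_feats[src_labels == c]
        xt = tgt_feats[tgt_pseudo_labels == c]
        if len(xs) == 0 or len(xt) == 0:
            continue
        per_class.append(mmd2(xs, xt, bandwidth))
    return float(np.mean(per_class)) if per_class else 0.0
```

In this sketch, identical source and target features within each class give a value near zero, while a shift between the class-wise feature distributions gives a larger value. A training loss would minimize such a term alongside the classification and clustering objectives.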

Unsupervised Domain Adaptation for Video Object Grounding with Cascaded Debiasing Learning

Video Unsupervised Domain Adaptation with Deep Learning: A Comprehensive Survey

Unsupervised Domain Adaptation Approach for Vision-Based Semantic Understanding of Bridge Inspection Scenes Without Manual Annotations

Multi-Modal Domain Adaptation Across Video Scenes for Temporal Video Grounding

GrabDAE: An Innovative Framework for Unsupervised Domain Adaptation Utilizing Grab-Mask and Denoise Auto-Encoder

Unified Domain Adaptive Semantic Segmentation

Proposal-Level Unsupervised Domain Adaptation for Open World Unbiased Detector

MoDA: Leveraging Motion Priors from Videos for Advancing Unsupervised Domain Adaptation in Semantic Segmentation

Video domain adaptation for semantic segmentation using perceptual consistency matching

Unsupervised Domain Adaption Harnessing Vision-Language Pre-training

Joint Clustering and Discriminative Feature Alignment for Unsupervised Domain Adaptation

Uncertainty-Aware Unsupervised Domain Adaptation in Object Detection

Deep Unsupervised Domain Adaptation: A Review of Recent Advances and Perspectives

Spatio-Temporal Pixel-Level Contrastive Learning-based Source-Free Domain Adaptation for Video Semantic Segmentation

Source-free Domain Adaptation Via Dynamic Pseudo Labeling and Self-supervision

Multiview Latent Space Learning with Progressively Fine-tuned Deep Features for Unsupervised Domain Adaptation

Unsupervised domain adaptation with weak source domain labels via bidirectional subdomain alignment

Decomposition-based Unsupervised Domain Adaptation for Remote Sensing Image Semantic Segmentation

Cross-domain video action recognition via adaptive gradual learning

Domain-Augmented Domain Adaptation