Abstract: Unsupervised Domain Adaptation (UDA) has shown promise in Scene Text Recognition (STR) by facilitating knowledge transfer from labeled synthetic text (source) to more challenging unlabeled real scene text (target). However, existing UDA-based STR methods rely entirely on the pseudo-labels of target samples and ignore the impact of domain gaps (inter-domain noise) and varied natural environments (intra-domain noise), which results in poor pseudo-label quality. In this paper, we propose a novel noisy-aware unsupervised domain adaptation framework tailored for STR, which aims to enhance model robustness against both inter- and intra-domain noise, thereby providing more precise pseudo-labels for target samples. Concretely, we propose reweighting target pseudo-labels by estimating the entropy of refined probability distributions, which mitigates the impact of domain gaps on pseudo-labels. Additionally, we propose a decoupled triple-P-N consistency matching module, which leverages data augmentation to increase data diversity and thereby enhances model robustness in diverse natural environments. Within this module, we design low-confidence-based character negative learning, which is decoupled from high-confidence-based positive learning and thus improves sample utilization when target samples are scarce. Furthermore, we extend our framework to the more challenging Source-Free UDA (SFUDA) setting, where only a pre-trained source model is available for adaptation, with no access to source data. Experimental results on benchmark datasets demonstrate the effectiveness of our framework. Under the SFUDA setting, our method exhibits faster convergence and superior performance with less training data than previous UDA-based STR methods. Our method surpasses representative STR methods, establishing new state-of-the-art results across multiple datasets.
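The two ideas named in the abstract, entropy-based reweighting of pseudo-labels and negative learning on low-confidence characters decoupled from positive learning on high-confidence ones, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the abstract does not specify how probabilities are refined, how the three views of the triple consistency are formed, or which thresholds are used. The function names, the normalized-entropy weight `1 - H / log C`, and the thresholds `hi` and `lo` below are all assumptions made for illustration.

```python
import numpy as np

def entropy_weights(probs, eps=1e-12):
    """Per-character weight in [0, 1] from normalized entropy (assumed form).

    probs: array of shape (T, C), one class distribution per character.
    A confident (low-entropy) prediction gets a weight near 1,
    a near-uniform (high-entropy) prediction gets a weight near 0.
    """
    num_classes = probs.shape[-1]
    entropy = -(probs * np.log(probs + eps)).sum(axis=-1)
    return np.clip(1.0 - entropy / np.log(num_classes), 0.0, 1.0)

def decoupled_pn_loss(probs_aug, pseudo_probs, hi=0.9, lo=0.05, eps=1e-12):
    """Hypothetical positive/negative losses for one text sample.

    probs_aug:    (T, C) predictions on an augmented view.
    pseudo_probs: (T, C) predictions used as pseudo-label source.
    hi, lo:       illustrative thresholds, not taken from the paper.
    """
    steps = np.arange(pseudo_probs.shape[0])
    conf = pseudo_probs.max(axis=-1)
    labels = pseudo_probs.argmax(axis=-1)
    weights = entropy_weights(pseudo_probs)

    # Positive learning: entropy-weighted cross-entropy on confident characters.
    pos_mask = conf >= hi
    ce = -np.log(probs_aug[steps, labels] + eps)
    pos = (weights * ce * pos_mask).sum() / max(pos_mask.sum(), 1)

    # Negative learning: for characters that are NOT confident, classes with
    # very low probability are treated as "this character is not class k".
    neg_mask = (pseudo_probs < lo) & (~pos_mask)[:, None]
    nl = -np.log(1.0 - probs_aug + eps)
    neg = (nl * neg_mask).sum() / max(neg_mask.sum(), 1)
    return pos, neg

# Toy example: one confident character and one ambiguous character.
pseudo = np.array([[0.97, 0.01, 0.01, 0.01],
                   [0.40, 0.40, 0.17, 0.03]])
pos_loss, neg_loss = decoupled_pn_loss(pseudo, pseudo)
```

In this toy example the first character contributes only to the positive term, while the second, whose top probability falls below `hi`, still contributes through the negative term on its least likely class. That is the sense in which decoupling can improve sample utilization.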

TextAdapter: Self-supervised Domain Adaptation for Cross-domain Text Recognition

Attention-based Cross-Layer Domain Alignment for Unsupervised Domain Adaptation

Towards Self-Similarity Consistency and Feature Discrimination for Unsupervised Domain Adaptation

ProtoUDA: Prototype-based Unsupervised Adaptation for Cross-Domain Text Recognition

Unsupervised Domain Adaptation Via Class Aggregation for Text Recognition

Cross Adversarial Consistency Self-Prediction Learning for Unsupervised Domain Adaptation Person Re-Identification

Self-Training with Contrastive Learning for Adversarial Domain Adaptation

Cross-domain Contrastive Learning for Unsupervised Domain Adaptation

Unsupervised Domain Adaptation with Adapter

Unsupervised Domain Adaptation for Referring Semantic Segmentation

Robust Text Image Recognition via Adversarial Sequence-to-Sequence Domain Adaptation

Online Unsupervised Domain Adaptation Via Reducing Inter- and Intra-Domain Discrepancies

DOC: Text Recognition Via Dual Adaptation and Clustering

Noisy-Aware Unsupervised Domain Adaptation for Scene Text Recognition

Enhanced Unsupervised Domain Adaptation with Dual-Attention Between Classification and Domain Alignment

Adversarial Learning and Interpolation Consistency for Unsupervised Domain Adaptation

Cross-domain feature enhancement for unsupervised domain adaptation

Discriminative Feature Adaptation Via Conditional Mean Discrepancy for Cross-Domain Text Classification

Cross-Region Domain Adaptation for Class-level Alignment

Contrastive Learning and Self-Training for Unsupervised Domain Adaptation in Semantic Segmentation

Instance Adaptive Self-training for Unsupervised Domain Adaptation