DOC: Text Recognition Via Dual Adaptation and Clustering

Xue-Ying Ding, Xiao-Qian Liu, Xin Luo, Xin-Shun Xu
DOI: https://doi.org/10.1109/tmm.2023.3245404
IF: 7.3
2023-01-01
IEEE Transactions on Multimedia
Abstract: Recently, unsupervised domain adaptation has been introduced to text image recognition to address the serious domain shift problem by transferring knowledge from source domains to target ones. However, in unsupervised domain adaptation for text recognition, the target domain has no label information to supervise the adaptation, especially at the character level. Several existing methods regard a text image as a whole and perform only global feature adaptation, neglecting local-level feature adaptation, i.e., adaptation at the level of characters. Other methods focus only on word-level feature alignment while ignoring the categories of local-level characters. To address these issues, we propose a text recognition model via Dual adaptatiOn and Clustering, DOC for short. At the word level, we construct a Global Discriminator for global feature adaptation to reduce text layout bias between the source and target domains. At the character level, we propose an Adaptive Feature Clustering (AFC) module, which extracts invariant character features through a local-level discriminator for adaptation. The AFC module further enhances local feature adaptation with a clustering scheme that evaluates the adaptation by leveraging knowledge from the source domain as much as possible. In this way, the model pays more attention to the differences among fine-grained characters. Extensive experiments on benchmark datasets demonstrate that our framework achieves state-of-the-art performance.