Addressing Domain Gap Via Content Invariant Representation for Semantic Segmentation

Li Gao, Lefei Zhang, Qian Zhang
DOI: https://doi.org/10.1609/aaai.v35i9.16922
2021-01-01
Abstract: The problem of unsupervised domain adaptation in semantic segmentation is a major challenge for numerous computer vision tasks because acquiring pixel-level labels is time-consuming and requires expensive human labor. A large gap exists among data distributions in different domains, which causes severe performance loss when a model trained with synthetic data is generalized to real data. Hence, we propose a novel domain adaptation approach, called Content Invariant Representation Network, to narrow the domain gap between the source (S) and target (T) domains. Previous works developed networks that directly transfer knowledge from S to T. In contrast, the proposed method aims to progressively reduce the gap between S and T on the basis of a Content Invariant Representation (CIR). CIR is an intermediate domain (I) that shares invariant content with S and has a data distribution similar to that of T. Then, an Ancillary Classifier Module (ACM) is designed to focus on pixel-level details and generate attention-aware results. ACM adaptively assigns different weights to pixels according to their domain offsets, thereby reducing local domain gaps. The global domain gap between CIR and T is also narrowed by enforcing local alignments. Finally, we perform self-supervised training in the pseudo-labeled target domain to further fit the distribution of the real data. Comprehensive experiments on two domain adaptation tasks, that is, GTAV → Cityscapes and SYNTHIA → Cityscapes, clearly demonstrate the superiority of our method compared with state-of-the-art methods.
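The last step of the abstract, self-supervised training on a pseudo-labeled target domain, can be illustrated with a minimal sketch. The abstract does not say how pseudo-labels are selected, so the confidence-threshold rule below is an assumption based on common practice, not the authors' method. The function name `pseudo_labels`, the threshold value 0.9, and the ignore index 255 are likewise hypothetical choices made for illustration.

```python
# Assumed "ignore" index for pixels left unlabeled; not taken from the paper.
IGNORE_LABEL = 255


def pseudo_labels(probs, threshold=0.9):
    """Turn per-pixel class probabilities into pseudo-labels.

    probs: H x W x C nested lists of softmax probabilities predicted
           for one unlabeled target-domain image.
    Pixels whose highest probability is below `threshold` receive
    IGNORE_LABEL so that they can be skipped during retraining.
    """
    labels = []
    for row in probs:
        label_row = []
        for pixel in row:
            # Index of the most probable class for this pixel.
            best_class = max(range(len(pixel)), key=pixel.__getitem__)
            if pixel[best_class] >= threshold:
                label_row.append(best_class)
            else:
                label_row.append(IGNORE_LABEL)
        labels.append(label_row)
    return labels


# Toy 2 x 2 image with 3 classes (values are made up for illustration).
example = [
    [[0.95, 0.03, 0.02], [0.40, 0.35, 0.25]],
    [[0.05, 0.92, 0.03], [0.10, 0.10, 0.80]],
]
result = pseudo_labels(example, threshold=0.9)
```

In this toy input only the two confident pixels keep a class label; the other two are marked with the ignore index. A real pipeline would operate on tensors and feed the resulting label map into a segmentation loss that skips ignored pixels.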