A Domain Adaptive Semantic Segmentation Method Using Contrastive Learning and Data Augmentation

Yixiao Xiang, Lihua Tian, Chen Li
DOI: https://doi.org/10.1007/s11063-024-11529-9
IF: 2.565
2024-02-27
Neural Processing Letters
Abstract: For semantic segmentation tasks, it is expensive to get pixel-level annotations on real images. Domain adaptation eliminates this process by transferring networks trained on synthetic images to real-world images. Self-training is one of the mainstream approaches to domain adaptation, but most self-training-based methods focus on how to select high-confidence pseudo-labels, i.e., they obtain domain-invariant knowledge only indirectly. A more direct means of explicitly aligning the data of the source and target domains, both globally and locally, is lacking. Meanwhile, the target features obtained by traditional self-training methods are relatively scattered and cannot be aggregated in a compact space. In this paper, we propose an approach that uses data augmentation and contrastive learning to transfer knowledge more effectively on the basis of self-training. Specifically, style transfer and image mixing modules are first introduced for data augmentation to cope with the large domain gap between the source and target domains. To ensure the aggregation of features from the same class and the discriminability of features from other classes during training, we propose a multi-scale pixel-level contrastive learning module. In addition, a cross-scale contrastive learning module is proposed to help each level of the model obtain more information on top of its own original task. Experiments show that our final trained model can effectively classify images from the target domain.
computer science, artificial intelligence
What problem does this paper attempt to address?
### Problems Addressed by the Paper

This paper primarily addresses domain adaptation in semantic segmentation tasks. Specifically, it targets the following aspects:

1. **Domain Adaptation**:
   - In semantic segmentation tasks, obtaining pixel-level annotations for real images is very expensive. With domain adaptation, a network trained on synthetic images can be transferred to real-world images, which removes the need for costly real-image annotation.
   - Traditional self-training methods can acquire some domain-invariant knowledge, but they do not align features directly enough, so the target-domain features remain relatively scattered and cannot be aggregated in a compact space.
2. **Feature Alignment and Discriminability**:
   - A multi-scale pixel-level contrastive learning module is proposed so that features of the same category in the source and target domains cluster more tightly and features of different categories become easier to tell apart.
   - This helps knowledge transfer and further improves the accuracy of semantic segmentation.
3. **Data Augmentation**:
   - Style transfer and image mixing modules are introduced for data augmentation, addressing the significant domain gap between the source and target domains.
   - Style transfer is carried out with mathematical transformations, so that the generated images are closer to the style of the target domain while staying semantically consistent with the original images.
   - The ClassMix module is improved so that the model captures inter-class contextual relationships more effectively and generalizes better.

In summary, this paper proposes a method that combines contrastive learning and data augmentation to address the domain adaptation problem in semantic segmentation tasks, improving the model's performance in the target domain.
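
To make the image mixing idea more concrete, here is a minimal sketch of the basic ClassMix operation in plain Python. It illustrates the original ClassMix idea only, not the improved variant proposed in the paper, whose details are not given here. The function name `classmix`, the list-of-lists image representation, and the use of target pseudo-labels for the unmasked pixels are assumptions made for this example.

```python
import random


def classmix(src_img, src_lbl, tgt_img, tgt_pseudo_lbl, rng=None):
    """Mix one labeled source image into one target image (ClassMix-style).

    All four inputs are equally sized 2-D lists (rows of pixels). Pixel
    values can be any objects (ints here); labels are integer class ids.
    Returns (mixed_img, mixed_lbl, mask).
    """
    rng = rng or random.Random()
    # Classes that actually occur in the source label map.
    classes = sorted({c for row in src_lbl for c in row})
    # Select half of them (rounded up, so at least one class is chosen).
    chosen = set(rng.sample(classes, (len(classes) + 1) // 2))
    # Binary mask: 1 where the source pixel belongs to a chosen class.
    mask = [[1 if c in chosen else 0 for c in row] for row in src_lbl]

    def blend(src, tgt):
        # Take the source value where mask == 1, the target value elsewhere.
        return [[s if m else t for s, t, m in zip(s_row, t_row, m_row)]
                for s_row, t_row, m_row in zip(src, tgt, mask)]

    return blend(src_img, tgt_img), blend(src_lbl, tgt_pseudo_lbl), mask


if __name__ == "__main__":
    # Toy 2x2 example: hypothetical pixel values and class ids.
    source_image = [[10, 11], [12, 13]]
    source_label = [[0, 0], [1, 1]]
    target_image = [[20, 21], [22, 23]]
    target_pseudo = [[5, 5], [6, 6]]  # pseudo-labels from self-training
    mixed_image, mixed_label, mask = classmix(
        source_image, source_label, target_image, target_pseudo,
        rng=random.Random(0),
    )
    print(mask, mixed_image, mixed_label)
```

Because the mixed image contains source objects pasted into a target scene, the network sees classes from both domains side by side in a single training sample. This is one plausible reading of how image mixing helps capture inter-class context.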