Domain Adaptation Transformer for Unsupervised Driving-Scene Segmentation in Adverse Conditions

Wenyu Liu, Song Wang, Jianke Zhu, Xuansong Xie, Lei Zhang
DOI: https://doi.org/10.1109/tits.2024.3461468
IF: 8.5
2024-01-01
IEEE Transactions on Intelligent Transportation Systems
Abstract: Semantic segmentation of driving scenes is important for modern autonomous driving technology. While existing methods have shown promising results on normal-condition images, their performance in adverse scenes remains unsatisfactory due to the limited visual field and the lack of annotations. To address this issue, we propose ACSegFormer, a transformer-based unsupervised domain adaptation method for semantic segmentation of driving scenes in adverse conditions, which aims to mine image features in visually restricted scenes. ACSegFormer introduces three training strategies to learn latent image context relations and to reduce the gaps between domains: an entropy-based pseudo-label correction scheme that refines the target-domain predictions with the normal reference predictions, an optimal transport-based inter-domain alignment module that performs domain alignment on the outputs of the transformer encoder, and a masked context learning module that enhances the model's ability to perceive missing information in target-domain images. ACSegFormer adds no additional trainable parameters on top of the existing transformer segmentation framework, so it can be easily used in self-training-based unsupervised domain adaptation approaches. Experimental results show that ACSegFormer achieves state-of-the-art performance on driving-scene segmentation benchmarks in adverse conditions, including Dark Zurich and ACDC.
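The abstract does not give the exact rule behind the entropy-based pseudo-label correction, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the target (adverse-condition) and reference (normal-condition) predictions are already aligned pixel by pixel, and that a pixel takes the reference label whenever the reference prediction has lower entropy than the target prediction. The function names and the selection rule are hypothetical and may differ from what ACSegFormer does.

```python
import math


def entropy(probs):
    """Shannon entropy of one pixel's class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def argmax(probs):
    """Index of the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)


def correct_pseudo_labels(target_probs, reference_probs):
    """Hypothetical correction rule (an assumption, not the paper's):
    for each pixel, take the label from whichever prediction is more
    confident, i.e. has the lower entropy."""
    corrected = []
    for t, r in zip(target_probs, reference_probs):
        source = r if entropy(r) < entropy(t) else t
        corrected.append(argmax(source))
    return corrected


# Toy example: two pixels, two classes, one probability vector per pixel.
target_probs = [[0.55, 0.45], [0.9, 0.1]]     # adverse-condition predictions
reference_probs = [[0.1, 0.9], [0.6, 0.4]]    # normal-condition predictions
corrected = correct_pseudo_labels(target_probs, reference_probs)
print(corrected)  # prints [1, 0]
```

In this toy case the first pixel is uncertain in the target prediction and takes the confident reference label, while the second pixel keeps its own confident target label.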