CMT: Cross Mean Teacher Unsupervised Domain Adaptation for VHR Image Semantic Segmentation

Liang Yan, Bin Fan, Shiming Xiang, Chunhong Pan
DOI: https://doi.org/10.1109/lgrs.2021.3065982
IF: 5.343
2021-01-01
IEEE Geoscience and Remote Sensing Letters
Abstract: Supervised deep learning models have achieved strong results in semantic segmentation of remote sensing images. However, their performance on unseen data domains can degrade severely because of domain shift. Recently, a series of unsupervised domain adaptation (UDA) methods has been developed to address domain shift in semantic segmentation. Most of them use adversarial learning to achieve global cross-domain alignment and a self-training (ST) strategy to generate pseudo-labels for classwise alignment. However, these methods ignore the pixels that are not assigned pseudo-labels. Those pixels lie mostly at the boundaries, which are vital to the final segmentation results. To solve this problem, this letter proposes a cross mean teacher (CMT) UDA method. The framework consists of two parts. First, global cross-domain distribution alignment is performed, and reliable pseudo-labels are then assigned to the target data. Second, a cross teacher–student network (CTSN) is developed to make effective use of pixels both with and without pseudo-labels. This network contains two student networks ($S_{1}$ and $S_{2}$) and two teacher networks ($T_{1}$ and $T_{2}$) and imposes cross-consistency constraints that supervise $S_{2}$ (or $S_{1}$) with the predictions of $T_{1}$ (or $T_{2}$). The cross supervision in CTSN helps prevent the performance bottlenecks caused by the tight coupling of the teacher and student networks in existing methods. Extensive experiments on three different remote sensing adaptation scenes verify the effectiveness and superiority of the proposed method.
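The cross pairing described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the loss form, the pseudo-label rule, or the teacher update, so the mean-squared consistency loss, the confidence threshold, the exponential-moving-average (EMA) teacher update (standard in mean-teacher methods), and all function names below are assumptions chosen for illustration.

```python
import numpy as np


def ema_update(teacher, student, decay=0.99):
    """Assumed mean-teacher update: teacher weights follow an
    exponential moving average of the student weights."""
    return {k: decay * teacher[k] + (1.0 - decay) * student[k] for k in teacher}


def pseudo_labels(probs, threshold=0.9, ignore_index=-1):
    """Assign a class only where the top probability clears the
    (assumed) threshold; other pixels get ignore_index."""
    labels = probs.argmax(axis=-1)
    labels[probs.max(axis=-1) < threshold] = ignore_index
    return labels


def consistency_loss(student_probs, teacher_probs):
    """Assumed consistency term: mean squared difference over all
    pixels, whether or not they carry a pseudo-label."""
    return float(np.mean((student_probs - teacher_probs) ** 2))


def cross_consistency(s1_probs, s2_probs, t1_probs, t2_probs):
    """Cross pairing from the abstract: T1 supervises S2 and
    T2 supervises S1, instead of each teacher supervising its own student."""
    return consistency_loss(s2_probs, t1_probs) + consistency_loss(s1_probs, t2_probs)
```

Because the consistency term is computed over every pixel, low-confidence pixels that receive no pseudo-label still contribute a training signal, which is the gap in self-training that the abstract points to.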