Self Pseudo Entropy Knowledge Distillation for Semi-supervised Semantic Segmentation
Xiaoqiang Lu, Licheng Jiao, Lingling Li, Fang Liu, Xu Liu, Shuyuan Yang
DOI: https://doi.org/10.1109/tcsvt.2024.3375789
IF: 5.859
2024-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Recently, semi-supervised semantic segmentation methods based on weak-to-strong consistency learning have achieved state-of-the-art performance. The key to this technique lies in strong perturbations and multi-objective co-training. However, CutMix, the most commonly used data augmentation in this field, limits the strength of perturbations because it focuses on only a single random local context. In addition, complex optimization targets reduce computational efficiency. In this work, we propose an efficient framework based on consistency learning. Specifically, a novel unsupervised data augmentation strategy, EntropyMix, is presented for semi-supervised semantic segmentation. Patches of unlabeled data from multi-view augmentations are combined into new training samples based on their prediction entropy, which provides more informative and powerful perturbations for consistency regularization and impels the model to focus on cross-view local context. On this basis, we further propose Self Pseudo Entropy Knowledge Distillation (SPEED) to learn global pixel relations from multi- and cross-view perturbations by optimizing a linear combination of feature- and logit-level distillation losses, enhancing model performance without additional auxiliary segmentation heads or a complex pre-trained teacher model. The combination of these two ideas is a plug-and-play technique that requires no additional modification. Extensive experimental results on the PASCAL VOC and Cityscapes datasets under various training settings demonstrate the superiority of the proposed data augmentation strategy and self-distillation loss, achieving new state-of-the-art performance. Remarkably, our method reaches an mIoU of 75.16% using only 0.87% labeled data on PASCAL VOC and an mIoU of 76.98% using only 6.25% labeled data on Cityscapes. The code is available at https://github.com/xiaoqiang-lu/SPEED.
engineering, electrical & electronic
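
The abstract describes EntropyMix only at a high level: patches from several augmented views of an unlabeled image are combined according to prediction entropy. The NumPy sketch below illustrates one possible reading of that idea and is not the authors' implementation. The function names (`pixel_entropy`, `patch_mean`, `entropy_mix`), the fixed square patch grid, the restriction to two views, and the rule of keeping the higher-entropy view for each patch are all assumptions made for illustration; the abstract does not state the selection rule.

```python
import numpy as np


def pixel_entropy(probs, eps=1e-8):
    """Shannon entropy per pixel. probs has shape (C, H, W) and sums to 1 over C."""
    return -(probs * np.log(probs + eps)).sum(axis=0)


def patch_mean(values, patch):
    """Average an (H, W) map over non-overlapping patch x patch cells."""
    h, w = values.shape
    return values.reshape(h // patch, patch, w // patch, patch).mean(axis=(1, 3))


def entropy_mix(view_a, view_b, probs_a, probs_b, patch=4):
    """Assemble one mixed image patch by patch from two augmented views.

    ASSUMPTION (not stated in the abstract): for each patch, keep the view
    whose prediction has the higher mean entropy. Ties keep view_a.
    H and W are assumed to be divisible by `patch`.
    """
    score_a = patch_mean(pixel_entropy(probs_a), patch)
    score_b = patch_mean(pixel_entropy(probs_b), patch)
    take_b = score_b > score_a  # shape (H // patch, W // patch)

    # Upsample the per-patch decision to a per-pixel boolean mask of shape (H, W).
    mask = np.repeat(np.repeat(take_b, patch, axis=0), patch, axis=1)

    # Broadcast the mask over the channel axis of the (3, H, W) images.
    mixed = np.where(mask[None], view_b, view_a)
    return mixed, mask
```

In a consistency-learning setup, the returned mask would also be used to mix the corresponding pseudo-labels so that the supervision matches the mixed image; that step is omitted here.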