Difference-Aware Distillation for Semantic Segmentation
Jianping Gou, Xiabin Zhou, Lan Du, Yibing Zhan, Wu Chen, Zhang Yi
DOI: https://doi.org/10.1109/tmm.2024.3405619
IF: 7.3
2024-10-19
IEEE Transactions on Multimedia
Abstract: In recent years, various distillation methods for semantic segmentation have been proposed. However, these methods typically train the student model to imitate the intermediate features or logits of the teacher model directly, thereby overlooking the regions of high discrepancy between the two models, particularly at instance edges. In this paper, we introduce a novel approach, called Difference-aware Distillation, to address this limitation. Our proposed method detects the discrepancies between the teacher model and the student model in the logit space through two masking mechanisms (i.e., masking by logit differences with respect to the ground truth labels and masking by differences in the predictive class probabilities), and guides the student model to restore the teacher's features with a focus on these highly discrepant regions, resulting in improved segmentation performance. With the features jointly masked by these two mechanisms, the student model learns to preserve the teacher's features via a feature generation module, thus achieving better representation. Our experimental evaluation on three datasets, Cityscapes, Pascal VOC 2012, and ADE20K, demonstrates that our proposed approach outperforms the baselines considered. Further visualization analysis confirms that our method effectively directs the student model's attention to the discrepancies, such as the edges of small objects and the interiors of large objects.
computer science, information systems, telecommunications, software engineering
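
The two masking mechanisms named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the function names (`softmax`, `top_fraction_mask`, `difference_masks`), the use of an absolute logit gap at the ground-truth class for the first mask, the L1 distance between class probability vectors for the second, the top-`fraction` selection rule, and the union used to combine the masks. It is not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_fraction_mask(scores, fraction):
    """Return a 0/1 mask marking the `fraction` of pixels with the largest scores.

    Hypothetical selection rule: the paper's actual thresholding is not
    stated in the abstract.
    """
    n = len(scores)
    k = max(1, round(fraction * n))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(n)]


def difference_masks(teacher_logits, student_logits, labels, fraction=0.25):
    """Compute two discrepancy masks and their union for a flattened image.

    teacher_logits, student_logits: one list of class logits per pixel.
    labels: ground-truth class index per pixel.
    """
    # Mechanism 1 (assumed form): teacher/student logit gap at the
    # ground-truth class of each pixel.
    gt_gap = [
        abs(t[y] - s[y])
        for t, s, y in zip(teacher_logits, student_logits, labels)
    ]
    # Mechanism 2 (assumed form): L1 distance between the two models'
    # predicted class probability vectors.
    prob_gap = [
        sum(abs(pt - ps) for pt, ps in zip(softmax(t), softmax(s)))
        for t, s in zip(teacher_logits, student_logits)
    ]
    mask_gt = top_fraction_mask(gt_gap, fraction)
    mask_prob = top_fraction_mask(prob_gap, fraction)
    # Joint mask (assumed to be a union): a pixel is selected if either
    # mechanism flags it.
    mask_joint = [max(a, b) for a, b in zip(mask_gt, mask_prob)]
    return mask_gt, mask_prob, mask_joint


if __name__ == "__main__":
    # Four pixels, two classes; values are made up for the example.
    teacher = [[4.0, 0.0], [10.0, 0.0], [1.0, 0.0], [0.0, 3.0]]
    student = [[4.0, 0.0], [5.0, 0.0], [1.0, 2.0], [0.0, 3.0]]
    labels = [0, 0, 0, 1]
    for mask in difference_masks(teacher, student, labels, fraction=0.25):
        print(mask)
```

In this toy input, the second pixel has a large logit gap at the ground-truth class while both models remain confident in the same class, and the third pixel has an identical ground-truth logit but a flipped prediction. Each mechanism therefore selects a different pixel, which suggests why combining the two could cover discrepancies that either one alone would miss.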