Visual representations with texts domain generalization for semantic segmentation

Wanlin Yue, Zhiheng Zhou, Yinglie Cao, Weikang Wu
DOI: https://doi.org/10.1007/s10489-023-05125-y
IF: 5.3
2023-11-10
Applied Intelligence
Abstract: Domain generalization for semantic segmentation with deep neural networks has so far made little progress. Most current methods fall into three groups: domain randomization, standardization, and whitening. We propose a novel approach to domain generalization for semantic segmentation: leveraging cross-modal information to supervise model training and improve the generalization ability of the network. We align visual features with textual features in a subspace and enhance the contrast between categories. This enables the network to learn rich semantic knowledge from text features and to form clearer category boundaries. Our experiments show that the method effectively improves the generalization ability of the network. We are the first to exploit multi-modal information for domain-generalized semantic segmentation.
computer science, artificial intelligence
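The abstract's idea of aligning visual features with text features in a shared subspace and sharpening the contrast between categories can be illustrated with a generic text-supervised classification loss. The sketch below is not the authors' implementation: the projection matrix `proj`, the `temperature` value, the function names, and the use of a softmax cross-entropy over cosine similarities are all assumptions chosen for illustration, and it handles a single pixel feature in plain Python.

```python
import math


def l2_normalize(v):
    # Scale a vector to unit length (assumes v is not all zeros).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def project(matrix, v):
    # Matrix-vector product: maps a visual feature into the shared subspace.
    return [sum(w * x for w, x in zip(row, v)) for row in matrix]


def text_alignment_loss(pixel_feat, text_embeds, label, proj, temperature=0.07):
    """Cross-entropy over cosine similarities between one projected pixel
    feature and one text embedding per class (hypothetical setup)."""
    z = l2_normalize(project(proj, pixel_feat))
    logits = []
    for t in text_embeds:
        t = l2_normalize(t)
        cosine = sum(a * b for a, b in zip(z, t))
        # A small temperature widens the gap between classes.
        logits.append(cosine / temperature)
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[label], logits


# Toy example: 3-d visual feature, 2-d shared subspace, two classes.
proj = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
text_embeds = [[1.0, 0.0], [0.0, 1.0]]
pixel_feat = [2.0, 0.1, 0.0]

loss_match, logits = text_alignment_loss(pixel_feat, text_embeds, 0, proj)
loss_mismatch, _ = text_alignment_loss(pixel_feat, text_embeds, 1, proj)
print(loss_match < loss_mismatch)
```

Minimizing such a loss pulls each pixel feature toward the text embedding of its own class and away from the others, which is one common way to obtain the clearer category boundaries the abstract mentions.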