On Exploring Shape and Semantic Enhancements for RGB-X Semantic Segmentation

Yuanjian Yang, Caifeng Shan, Fang Zhao, Wenli Liang, Jungong Han
DOI: https://doi.org/10.1109/tiv.2023.3296219
IF: 8.2
2024-01-01
IEEE Transactions on Intelligent Vehicles
Abstract: The robustness of scene segmentation under poor environmental conditions can be enhanced with the aid of information from other modalities, e.g., thermal and/or depth. In this context, RGB-X semantic segmentation is becoming prevalent. Most existing RGB-X semantic segmentation models focus on the fusion strategy between different modalities or between multiple stages, but ignore feature recovery on the decoder side. This makes it difficult to recover the information lost to downsampling, and also overlooks the pixel connections between segmented objects. To solve these problems, we propose a Shape and Semantic Enhancements Module (SASEM), which is characterized by innovations on the decoder side. More specifically, we divide the decoder into a shape supervision branch and a semantic supervision branch. The former reinforces the shape information of each category by using a signed distance map, and a multi-stage enhancement structure is designed to further strengthen the shape information of the features. The latter directly enhances the semantic extraction capability of the decoder by employing a channel-level semantic enhancement module, which reduces the interference of the shape supervision branch with the semantic information. The two branches work together to enhance the inter-pixel relationship, thus making the decoder more capable of recovering the fused encoded features. Our proposed SASEM serves as a plug-and-play module for different networks, as evidenced by experiments on various RGB-Thermal and RGB-Depth datasets, where the module can be easily integrated and consistently improves performance. The code of our method will be released at: https://github.com/HenonBamboo/SASEM.
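The abstract does not say how the signed distance map used by the shape supervision branch is built, so the following is only a minimal sketch of one common construction, not the authors' implementation. It assumes a per-class binary mask derived from the ground-truth label map, Euclidean distance, and a sign convention of positive inside the object and negative outside. The function names `signed_distance_map` and `per_class_sdm` are hypothetical; the only library call relied on is `scipy.ndimage.distance_transform_edt`.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def signed_distance_map(mask: np.ndarray) -> np.ndarray:
    """Signed Euclidean distance map of a binary mask.

    Assumed convention (not taken from the paper): positive inside
    the object, negative outside.
    """
    mask = mask.astype(bool)
    # A class that is absent or fills the whole image has no boundary,
    # so return an all-zero map instead of an ill-defined distance.
    if not mask.any() or mask.all():
        return np.zeros(mask.shape, dtype=np.float32)
    # Distance from each foreground pixel to the nearest background pixel.
    inside = distance_transform_edt(mask)
    # Distance from each background pixel to the nearest foreground pixel.
    outside = distance_transform_edt(~mask)
    return (inside - outside).astype(np.float32)


def per_class_sdm(labels: np.ndarray, num_classes: int) -> np.ndarray:
    """Stack one signed distance map per class: shape (num_classes, H, W)."""
    return np.stack(
        [signed_distance_map(labels == c) for c in range(num_classes)]
    )


# Toy label map: a 3x3 object of class 1 on a class-0 background.
labels = np.zeros((5, 5), dtype=np.int64)
labels[1:4, 1:4] = 1
sdm = per_class_sdm(labels, num_classes=2)
print(sdm.shape)     # (2, 5, 5)
print(sdm[1, 2, 2])  # 2.0, the object centre is two pixels from the background
```

A map of this kind could serve as a regression target for a shape branch, since every pixel then encodes how far it lies from the object boundary rather than only which class it belongs to. How SASEM normalizes or weights such a target is not stated in the abstract.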