Cascading Context Enhancement Network for RGB-D Semantic Segmentation

Xu Tang, Zejun Zhang, Yan Meng, Jianxiao Xie, Changbing Tang, Weichuan Zhang
DOI: https://doi.org/10.1007/s11042-024-19110-1
IF: 2.577
2024-01-01
Multimedia Tools and Applications
Abstract: Relying solely on local features makes it difficult to fully understand an image’s global structure and fine-grained details. Context information is crucial for semantic segmentation because it provides rich target clues, especially in the deep stages of a network, where contextual information is abundant. However, most existing methods only align the features of the two modalities and do not consider the vital role of deep contextual information. To analyze both the global structure and the fine-grained details of an image, a multi-head context aggregation module is proposed that adaptively computes the context aggregation relationship between low-level and high-level RGB context information to enhance spatial structure information, and a local-global context channel enhancement module is presented that learns local-global context to enhance the depth channel representation. The proposed Cascading Context Enhancement Network (CCENet) is compared with recent methods on the NYUDv2 and SUN RGB-D datasets, and the results show that it achieves state-of-the-art performance.