Non-Local Aggregation for RGB-D Semantic Segmentation

Guodong Zhang, Jing-Hao Xue, Pengwei Xie, Sifan Yang, Guijin Wang
DOI: https://doi.org/10.1109/lsp.2021.3066071
2021-01-01
IEEE Signal Processing Letters
Abstract: Exploiting both RGB (2D appearance) and Depth (3D geometry) information can improve the performance of semantic segmentation. However, due to the inherent difference between RGB and Depth information, integrating RGB-D features effectively remains a challenging problem. In this letter, to address this issue, we propose a Non-local Aggregation Network (NANet), with a well-designed Multi-modality Non-local Aggregation Module (MNAM), to better exploit the non-local context of RGB-D features at multiple stages. Unlike most existing RGB-D semantic segmentation schemes, which only exploit local RGB-D features, the MNAM enables the aggregation of non-local RGB-D information along both spatial and channel dimensions. The proposed NANet achieves performance comparable with state-of-the-art methods on two popular RGB-D benchmarks, NYUDv2 and SUN-RGBD.
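To make the idea of aggregating non-local RGB-D information along both spatial and channel dimensions more concrete, the following is a minimal sketch of generic non-local (attention-style) aggregation. It is not the MNAM as defined in the letter, whose exact design the abstract does not give. The function name `nonlocal_aggregate`, the element-wise sum used to fuse the two modalities, the dot-product similarity, and the residual sum of the two branches are all assumptions made for illustration. Features are plain C x N lists (channels by flattened pixel positions) so that the example needs no external libraries.

```python
import math


def softmax_rows(m):
    """Row-wise softmax of a list-of-lists matrix."""
    out = []
    for row in m:
        peak = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def nonlocal_aggregate(rgb, depth):
    """Hypothetical helper: rgb and depth are C x N feature matrices."""
    # Assumption: fuse the modalities by element-wise sum (the letter may differ).
    x = [[r + d for r, d in zip(rgb_row, depth_row)]
         for rgb_row, depth_row in zip(rgb, depth)]
    xt = transpose(x)  # N x C

    # Spatial branch: each position aggregates features from all positions.
    spatial_attn = softmax_rows(matmul(xt, x))          # N x N
    spatial_out = transpose(matmul(spatial_attn, xt))   # C x N

    # Channel branch: each channel aggregates features from all channels.
    channel_attn = softmax_rows(matmul(x, xt))          # C x C
    channel_out = matmul(channel_attn, x)               # C x N

    # Assumption: combine the fused input and both branches by a residual sum.
    return [[a + b + c for a, b, c in zip(x_row, s_row, c_row)]
            for x_row, s_row, c_row in zip(x, spatial_out, channel_out)]
```

Calling `nonlocal_aggregate(rgb, depth)` on two C x N matrices returns a C x N matrix in which every entry depends on all positions and all channels of the fused input, which is the sense of "non-local" used here. A practical network would typically add learned projections and apply such a block at several encoder stages.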