2.5D Convolution for RGB-D Semantic Segmentation

Yajie Xing, Jingbo Wang, Xiaokang Chen, Gang Zeng
DOI: https://doi.org/10.1109/icip.2019.8803757
2019-01-01
Abstract: Convolutional neural networks (CNNs) have achieved great success in RGB semantic segmentation. RGB-D images provide additional depth information, which can improve segmentation performance. To take full advantage of the 3D geometric relations provided by RGB-D images, we propose 2.5D convolution, which mimics a single 3D convolution kernel with several masked 2D convolution kernels. 2.5D convolution processes spatial relations between pixels in a manner similar to 3D convolution while still sampling pixels on the 2D plane, which saves computational cost. It can also be seamlessly incorporated into pretrained CNNs. Experiments on two challenging RGB-D semantic segmentation benchmarks, NYUDv2 and SUN-RGBD, validate the effectiveness of our approach.
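The masking idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes one 2D kernel per depth plane, a fixed depth bin width `grid`, and a simple rounding rule that assigns each neighbor to a plane by its depth offset from the center pixel. The function name `conv25d`, its parameters, and the binning rule are hypothetical choices made for illustration; the abstract does not specify them.

```python
import numpy as np

def conv25d(feat, depth, weights, grid):
    """Illustrative 2.5D convolution (assumed formulation, stride 1, zero padding).

    feat:    (C_in, H, W) feature map
    depth:   (H, W) depth map aligned with feat
    weights: (K, C_out, C_in, kh, kw), one 2D kernel per depth plane
    grid:    assumed fixed width of one depth bin
    """
    K, c_out, c_in, kh, kw = weights.shape
    _, H, W = feat.shape
    ph, pw = kh // 2, kw // 2
    feat_p = np.pad(feat, ((0, 0), (ph, ph), (pw, pw)))
    depth_p = np.pad(depth, ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros((c_out, H, W))
    for i in range(kh):
        for j in range(kw):
            nb_feat = feat_p[:, i:i + H, j:j + W]    # neighbor features
            nb_depth = depth_p[i:i + H, j:j + W]     # neighbor depths
            # Plane index of each neighbor relative to the center pixel;
            # K // 2 is the plane that contains the center itself.
            plane = np.rint((nb_depth - depth) / grid).astype(int) + K // 2
            for k in range(K):
                mask = plane == k                    # neighbors lying on plane k
                out += np.einsum("oc,chw->ohw",
                                 weights[k, :, :, i, j], nb_feat * mask)
    return out
```

With a constant depth map every neighbor falls on the center plane, so under these assumptions the operation reduces to an ordinary 2D convolution with the middle kernel. Neighbors whose depth offset falls outside all K planes contribute nothing.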