CSNet: a ConvNeXt-based Siamese network for RGB-D salient object detection

Yunhua Zhang,Hangxu Wang,Gang Yang,Jianhao Zhang,Congjin Gong,Yutao Wang
DOI: https://doi.org/10.1007/s00371-023-02887-x
2023-05-24
Abstract:Global contexts are critical to locating salient objects for salient object detection (SOD). However, the convolution operation in CNNs has a local receptive field, which cannot capture long-distance global information. Recent studies have shown that modernized CNN models with large kernel convolution, such as ConvNeXt, can effectively extend the receptive fields. Based on it, this paper explores the potential of large kernel CNN for SOD task. Inspired by the common information between RGB and depth images in salient objects, we propose a ConvNeXt-based Siamese network with shared weight parameters. This structural design can effectively reduce the number of parameters without sacrificing performance. Furthermore, a depth information preprocessing module is proposed to minimize the impact of low-quality depth images on predicted saliency maps. For cross-modal feature interaction, a dynamic fusion module is designed to enhance cross-modal complementarity dynamically. Extensive experiments and evaluation results on six benchmark datasets demonstrate the outstanding performance of the proposed method against 14 state-of-the-art RGB-D methods. Our code will be released at https://github.com/zyh5119232/CSNet.
computer science, software engineering
What problem does this paper attempt to address?