BGRDNet: RGB-D salient object detection with a bidirectional gated recurrent decoding network

Zhengyi Liu, Yuan Wang, Zhili Zhang, Yacheng Tan
DOI: https://doi.org/10.1007/s11042-022-12799-y
IF: 2.577
2022-03-23
Multimedia Tools and Applications
Abstract: The traditional U-Net framework generates multi-level features through successive convolution and pooling operations, and then decodes the saliency cue through progressive upsampling and skip connections. The multi-level features are generated from the same input source, yet differ considerably from one another. In this paper, we explore the complementarity among multi-level features and decode them with a Bi-GRU. Since multi-level features differ in size, we first propose a scale adjustment module that organizes them into sequential data with the same channel count and resolution. SAGRU, the core unit of the Bi-GRU, is then devised based on self-attention and can effectively fuse the history and the current input. Based on SAGRU, we further present a bidirectional decoding fusion module, which decodes the multi-level features in both bottom-up and top-down manners. The proposed bidirectional gated recurrent decoding network is applied to RGB-D salient object detection, which leverages the depth map as complementary information. Concretely, we put forward a depth-guided residual module to enhance the color features. Experimental results demonstrate that our method outperforms state-of-the-art methods on six popular benchmarks. Ablation studies also verify that each module plays an important role.
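The decoding idea in the abstract (bring multi-level features to one resolution, then run a gated recurrent unit over them in both directions) can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps the self-attention-based SAGRU for a plain GRU cell with one scalar state per pixel, uses nearest-neighbour resizing as a stand-in for the scale adjustment module, and averages the two directions as an assumed fusion rule. All function names, weights, and toy inputs are hypothetical.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def resize_nearest(fmap, size):
    # Stand-in for scale adjustment: nearest-neighbour resize of a
    # square single-channel map (list of lists) to size x size.
    n = len(fmap)
    return [[fmap[r * n // size][c * n // size] for c in range(size)]
            for r in range(size)]

def gru_step(h, x, w):
    # Plain GRU update for one scalar state (not the paper's SAGRU).
    # w holds hypothetical scalar weights chosen only for illustration.
    z = sigmoid(w["zx"] * x + w["zh"] * h)            # update gate
    r = sigmoid(w["rx"] * x + w["rh"] * h)            # reset gate
    cand = math.tanh(w["cx"] * x + w["ch"] * r * h)   # candidate state
    return (1.0 - z) * h + z * cand

def decode(sequence, w):
    # Run the GRU over a sequence of equally sized maps, pixel by pixel,
    # starting from an all-zero hidden state.
    size = len(sequence[0])
    h = [[0.0] * size for _ in range(size)]
    for fmap in sequence:
        h = [[gru_step(h[r][c], fmap[r][c], w) for c in range(size)]
             for r in range(size)]
    return h

def bidirectional_decode(levels, size, w):
    # Align all levels, decode in both orders, and average the results
    # (averaging is an assumed fusion rule, not taken from the paper).
    seq = [resize_nearest(f, size) for f in levels]
    fwd = decode(seq, w)
    bwd = decode(seq[::-1], w)
    return [[0.5 * (fwd[r][c] + bwd[r][c]) for c in range(size)]
            for r in range(size)]

# Toy multi-level features at resolutions 8x8, 4x4, and 2x2.
weights = {"zx": 0.5, "zh": 0.5, "rx": 0.5, "rh": 0.5, "cx": 1.0, "ch": 1.0}
levels = [
    [[0.1 * (r + c) for c in range(8)] for r in range(8)],
    [[0.2 * (r + c) for c in range(4)] for r in range(4)],
    [[0.5 * (r + c) for c in range(2)] for r in range(2)],
]
fused = bidirectional_decode(levels, 4, weights)
```

Here `fused` is a 4x4 map that combines what the recurrent unit accumulates when reading the levels from fine to coarse and from coarse to fine. A real model would use learned convolutional weights and multi-channel features rather than the scalar weights assumed here.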
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
What problem does this paper attempt to address?