RGB-D Saliency Detection Via Complementary and Selective Learning

Pan Wenwen, Sun Xiaofei, Qian Yunsheng
DOI: https://doi.org/10.1007/s10489-022-03612-2
IF: 5.3
2022-01-01
Applied Intelligence
Abstract: Previous RGB-D saliency detection methods adopt different fusion schemes to fuse RGB images and depth maps, or their saliency maps. However, feature maps from different modalities are not equally important, and neither are the different features within the same map. To address this problem, we present a precise RGB-D saliency detection framework that selectively fuses features of different resolutions from the two modalities, taking into account the complementarity of global location and local detail. Depth data provide strong positional discrimination, which has been shown to enhance saliency prediction. However, errors or missing areas in a depth map, or random distributions along an object boundary, have a negative effect. Therefore, we design a backbone network and an edge detection module that select useful representations from RGB images and depth maps with an attention mechanism and effectively integrate macroscopic and microscopic features from the two modalities. Cross-modal selective fusion and complementation yield accurate localization of salient objects with fine edge details. We also propose a triple loss function to improve the credibility of the network on hard samples. Extensive quantitative and qualitative experiments on six benchmark datasets show that our method outperforms 11 existing state-of-the-art methods across various evaluation metrics.
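The abstract does not specify how the attention-based selection is implemented, so the following is only a minimal sketch of one common way such selective fusion could look: a squeeze-and-excitation style channel gate applied to each modality before summation. The function names (`channel_gate`, `selective_fuse`), the weight shapes, and the use of NumPy are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_gate(feat, w1, w2):
    """Hypothetical channel attention: one weight in (0, 1) per channel.

    feat: feature map of shape (C, H, W)
    w1:   reduction weights of shape (R, C)
    w2:   expansion weights of shape (C, R)
    """
    squeezed = feat.mean(axis=(1, 2))        # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)  # ReLU bottleneck -> (R,)
    return sigmoid(w2 @ hidden)              # per-channel gate -> (C,)


def selective_fuse(rgb_feat, depth_feat, params):
    """Weight each modality's channels by its own gate, then sum.

    This is an illustrative stand-in, not the fusion used in the paper.
    """
    g_rgb = channel_gate(rgb_feat, *params["rgb"])
    g_depth = channel_gate(depth_feat, *params["depth"])
    # Broadcast the (C,) gates over the spatial dimensions.
    return g_rgb[:, None, None] * rgb_feat + g_depth[:, None, None] * depth_feat


# Toy inputs; in a real network the weights would be learned.
rng = np.random.default_rng(0)
C, H, W, R = 8, 4, 4, 2
params = {
    "rgb": (rng.standard_normal((R, C)), rng.standard_normal((C, R))),
    "depth": (rng.standard_normal((R, C)), rng.standard_normal((C, R))),
}
rgb_feat = rng.standard_normal((C, H, W))
depth_feat = rng.standard_normal((C, H, W))
fused = selective_fuse(rgb_feat, depth_feat, params)
```

In this sketch a gate near 0 suppresses a channel, for example a depth channel dominated by missing regions, while a gate near 1 passes it through. With all-zero weights every gate equals 0.5 and the fusion reduces to a plain average of the two modalities.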