RGB-D salient object detection via cross-modal joint feature extraction and low-bound fusion loss

Xinxin Zhu, Yi Li, Huazhu Fu, Xiaoting Fan, Yanan Shi, Jianjun Lei
DOI: https://doi.org/10.1016/j.neucom.2020.05.110
IF: 6
2021-09-01
Neurocomputing
Abstract: RGB-D salient object detection aims to identify attractive objects in a scene by combining the color image and the depth map. However, owing to the differences between the RGB image and the depth map, using cross-modal data effectively is a key issue. In this paper, we propose a novel RGB-D salient object detection method based on cross-modal joint feature extraction and a low-bound fusion loss. A two-stream framework is designed to generate saliency maps for the RGB image and the depth map. During feature extraction, a cross-modal joint feature extraction module (CFM) captures valuable joint features from the two streams. The CFM explores complementary information during feature extraction and feeds the joint features to the aggregation stage of the network. A fusion block (FB) then aggregates the multi-scale features of each stream with the joint features to generate updated features. In addition, a low-bound fusion loss is designed to constrain the predictions of the two streams, raising the lower bound of the saliency values and producing a distinct saliency map. Experimental results on five datasets demonstrate that the proposed method achieves superior performance.
computer science, artificial intelligence