Abstract: Recent years have witnessed growing interest in RGB-D Salient Object Detection (SOD), which benefits from the ample spatial layout cues embedded in depth maps to help SOD models distinguish salient objects from complex backgrounds or similar surroundings. Despite this progress, this emerging line of research has been considerably hindered by the noise and ambiguity that prevail in raw depth images, as well as by the coarse object boundaries in saliency predictions. To address these issues, we propose a Depth Calibration and Boundary-aware Fusion (DCBF) framework that contains two novel components: (1) a learning strategy that calibrates the latent bias in the original depth maps to boost SOD performance; and (2) a boundary-aware multimodal fusion module that fuses the complementary cues from the RGB and depth channels and improves object boundary quality. In addition, we introduce a new saliency dataset, HiBo-UA, which contains 1515 high-resolution RGB-D images with finely annotated pixel-level labels. To the best of our knowledge, this is the first high-resolution RGB-D saliency dataset, with significantly higher image resolution (nearly 7×) than the widely used DUT-D dataset. With its richer object boundary details, the proposed dataset can accurately assess how well various saliency models retain fine-grained object boundaries. It also serves our research community's growing need for access to higher-resolution data. Extensive experiments demonstrate the superior performance of our approach against 31 state-of-the-art methods. Notably, our calibrated depth alone can work in a plug-and-play manner; empirically, it brings noticeable improvements when applied to existing state-of-the-art RGB-D SOD models.
