Abstract: The effectiveness of depth information in saliency detection is well established. However, how to use depth information more efficiently remains an open question. Erroneous depth information may cause detection failures, and non-salient objects may lie closer to the camera, which leads to an erroneous emphasis on non-salient regions. Moreover, most existing RGB-D saliency detection models are not robust when the salient object touches the image boundaries. To mitigate these problems, we propose a multi-stage saliency detection model based on a bilateral absorbing Markov chain guided by depth information. The proposed model progressively extracts saliency cues in three stages (low-, mid-, and high-level). First, we generate low-level saliency cues by explicitly combining color and depth information. Then, we design a bilateral absorbing Markov chain to calculate mid-level saliency maps. At the mid level, to mitigate the boundary-touching problem, we present a background seed screening mechanism (BSSM) that improves the construction of the two-layer sparse graph and selects background-based absorbing nodes more reliably. Furthermore, a cross-modal multi-graph learning model (CMLM) is designed to fully explore the intrinsic complementary relationship between color and depth information. Finally, to obtain a more highlighted and homogeneous saliency map at the high level, we construct a depth-guided optimization module that combines cellular automata with a suppression-enhancement function pair. This optimization module refines the saliency map in color space and in depth space, respectively. Comprehensive experiments on three challenging benchmark datasets demonstrate the effectiveness of the proposed method both qualitatively and quantitatively.
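To make the mid-level idea concrete, the following is a minimal sketch of a generic absorbing Markov chain on a small graph, not the bilateral, depth-guided chain proposed in the paper. It assumes that background seeds act as absorbing nodes and that the expected time to absorption of each remaining (transient) node serves as a raw saliency score. The function name `absorbed_time`, the toy affinity matrix, and the choice of node 0 as the only background seed are illustrative assumptions.

```python
import numpy as np

def absorbed_time(affinity, transient_idx, absorbing_idx):
    """Expected number of steps before absorption, per transient node.

    Generic absorbing-chain computation (illustrative only): with Q the
    transient-to-transient transition block, the fundamental matrix is
    N = (I - Q)^-1 and the absorbed time is y = N @ 1.
    """
    W = np.asarray(affinity, dtype=float)
    t = np.asarray(transient_idx)
    a = np.asarray(absorbing_idx)
    W_tt = W[np.ix_(t, t)]          # transient -> transient weights
    W_ta = W[np.ix_(t, a)]          # transient -> absorbing weights
    # Row-normalise over all outgoing edges of each transient node.
    row_sum = W_tt.sum(axis=1) + W_ta.sum(axis=1)
    Q = W_tt / row_sum[:, None]
    # Solve (I - Q) y = 1 rather than forming the inverse explicitly.
    return np.linalg.solve(np.eye(len(t)) - Q, np.ones(len(t)))

# Toy path graph 0 - 1 - 2 - 3 with unit edge weights.
# Node 0 is a hypothetical background seed (absorbing); 1..3 are transient.
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
y = absorbed_time(W, transient_idx=[1, 2, 3], absorbing_idx=[0])

# Min-max normalise so that longer absorbed time maps to higher saliency.
saliency = (y - y.min()) / (y.max() - y.min())
```

In this toy graph, the nodes farther from the background seed take longer to be absorbed and therefore receive higher scores. In a real pipeline the nodes would typically be superpixels and the affinities would depend on color and depth similarity; those details are left out here.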

3D Layout Encoding Network for Spatial-Aware 3D Saliency Modelling

Learning Stereoscopic Visual Attention Model for 3D Video

A Depth Estimation Framework Based on Unsupervised Learning and Cross-Modal Translation

LA-Net: Layout-Aware Dense Network for Monocular Depth Estimation

Depth Cue Enhancement and Guidance Network for RGB-D Salient Object Detection

Saliency Detection with Bilateral Absorbing Markov Chain Guided by Depth Information

Edge-Semantic Learning Strategy for Layout Estimation in Indoor Environment

LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering

Exploring Deep 3D Spatial Encodings for Large-Scale 3D Scene Understanding

Accurate Saliency Detection Based on Depth Feature of 3D Images

TCANet: three-stream coordinate attention network for RGB-D indoor semantic segmentation

3D Room Layout Estimation from a Single RGB Image

A Novel Saliency Model for Stereoscopic Images

Exploring Hierarchical Spatial Layout Cues for 3D Point Cloud based Scene Graph Prediction

A new representation of scene layout improves saliency detection in traffic scenes

EF-Net: A Novel Enhancement and Fusion Network for RGB-D Saliency Detection

SE3D: A Framework For Saliency Method Evaluation In 3D Imaging

A Computational Model for Stereoscopic Visual Saliency Prediction

A Novel Edge-Inspired Depth Quality Evaluation Network for RGB-D Salient Object Detection

Edge-Aware Spatial Propagation Network for Multi-view Depth Estimation

Explicit3D: Graph Network with Spatial Inference for Single Image 3D Object Detection