Abstract: 4D light field data record a scene from multiple views and thus implicitly provide depth cues that benefit salient object detection in challenging scenes. Existing light field salient object detection (LF SOD) methods usually use a large number of views to improve detection accuracy. However, relying on so many views makes LF SOD difficult to apply in practice. Considering that adjacent views in a light field have very similar content, in this work we propose a more efficient pattern of input views, i.e., key sparse views, and design a network that effectively exploits the depth cue from sparse views for LF SOD. Specifically, we first apply a low rank-based statistical analysis to the existing LF SOD datasets, which allows us to derive a fixed yet universal pattern for our key sparse views, including the number and positions of the views. These views retain sufficient depth cues while greatly reducing the number of views to be captured and processed, facilitating practical applications. Then, we propose an effective solution for LF SOD from key sparse views, named CDINet, built around a key Complementary and Discriminative Interaction Module (CDIM). CDINet follows a two-stream structure that extracts the depth cue from the light field stream (i.e., sparse views) and the appearance cue from the RGB stream (i.e., center view), generating features and initial saliency maps for each stream. The CDIM is tailored for inter-stream interaction of both these features and saliency maps, using the depth cue to complement the missing salient regions in the RGB stream and to discriminate background distraction, further enhancing the final saliency map. Extensive experiments on three LF multi-view datasets demonstrate that CDINet not only outperforms state-of-the-art 2D methods but also achieves competitive performance compared with state-of-the-art 3D and 4D methods.
The code and results of our method are available at https://github.com/GilbertRC/LFSOD-CDINet.
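
The abstract does not spell out the low rank-based statistical analysis, so the following is only a generic illustration of why near-duplicate views suggest a sparse input pattern, not the authors' procedure. It assumes each view is flattened into one row of a matrix and that redundancy is measured by how few singular components carry most of the energy. The function name `effective_rank`, the `energy` threshold, and the toy data are all hypothetical and do not come from the CDINet code.

```python
import numpy as np


def effective_rank(views: np.ndarray, energy: float = 0.95) -> int:
    """Smallest number of singular components that keeps `energy` of the
    squared singular-value energy of the stacked views.

    views: array of shape (n_views, H, W), e.g. grayscale sub-aperture images.
    """
    n_views = views.shape[0]
    # One row per view: similar views give nearly linearly dependent rows.
    matrix = views.reshape(n_views, -1).astype(np.float64)
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    cumulative = np.cumsum(singular_values**2) / np.sum(singular_values**2)
    k = int(np.searchsorted(cumulative, energy)) + 1
    return min(k, len(singular_values))


# Toy data (assumption, not a real light field): 9 views mixed from 2 bases.
rng = np.random.default_rng(0)
bases = rng.standard_normal((2, 32, 32))
weights = rng.standard_normal((9, 2))
redundant_views = np.tensordot(weights, bases, axes=1)  # shape (9, 32, 32)
independent_views = rng.standard_normal((9, 32, 32))

k_redundant = effective_rank(redundant_views, energy=0.999)
k_independent = effective_rank(independent_views, energy=0.999)
```

On this toy data the redundant stack needs at most two components, while the independent noise stack needs all nine. A real analysis would run on the sub-aperture images of the LF SOD datasets and would also have to decide which view positions to keep, which this sketch does not attempt.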
