Abstract: Although deep learning-based techniques for salient object detection have improved considerably in recent years, estimated saliency maps still contain imprecise predictions owing to the internal complexity and indefinite boundaries of salient objects of varying sizes. Existing methods emphasize the design of an exemplary structure that integrates multi-level features, employing multi-scale features and attention modules to filter salient regions from cluttered scenes. We propose a saliency detection network built on three novel contributions. First, we introduce a dense feature extraction unit (DFEU) that uses large-kernel asymmetric and group-wise convolutions with channel reshuffling. The DFEU extracts semantically enriched features with large receptive fields while reducing the gridding problem and the parameter count for subsequent operations. Second, we propose a cross-feature integration unit (CFIU) that extracts semantically enriched features at high resolution using dense short connections and sub-samples the integrated information into different attentional branches according to the inputs received at each stage of the backbone. The embedded independent attentional branches can observe the importance of the sub-regions of a salient object. With the constraint-wise growth of the sub-attentional branches at various stages, the CFIU efficiently avoids global and local feature dilution by extracting semantically enriched features from high and low levels via dense short connections. Finally, we devise a contour-aware saliency refinement unit (CSRU) that blends contour and contextual features in a progressive, densely connected fashion to help the model obtain more accurate saliency maps with precise boundaries in complex and perplexing scenarios. The proposed model was evaluated with ResNet-50 and VGG-16 backbones and outperforms most contemporary techniques with fewer parameters.
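
To illustrate why asymmetric, group-wise convolutions can lower the parameter count, the sketch below compares the weight count of a full k x k convolution with a factorized 1 x k plus k x 1 pair, with and without groups, and shows a simple channel reshuffle. The abstract does not state kernel sizes, channel widths, or group counts, so the values used here (64 channels, k = 7, 4 groups) and all function names are assumptions chosen for illustration only; this is not the authors' implementation.

```python
def conv_params(c_in, c_out, kh, kw, groups=1):
    """Weight count of a 2-D convolution (bias omitted)."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * kh * kw


def asymmetric_params(c_in, c_out, k, groups=1):
    """Weights of a k x k kernel factorized into a 1 x k conv followed by a k x 1 conv."""
    first = conv_params(c_in, c_out, 1, k, groups)
    second = conv_params(c_out, c_out, k, 1, groups)
    return first + second


def channel_shuffle(channels, groups):
    """Interleave channels across groups so later group-wise convs mix information."""
    assert len(channels) % groups == 0
    per_group = len(channels) // groups
    # Equivalent to viewing the list as (groups, per_group), transposing, and flattening.
    return [channels[g * per_group + i] for i in range(per_group) for g in range(groups)]


if __name__ == "__main__":
    c, k, g = 64, 7, 4  # hypothetical values, not taken from the paper
    print("full k x k:          ", conv_params(c, c, k, k))
    print("asymmetric:          ", asymmetric_params(c, c, k))
    print("asymmetric + groups: ", asymmetric_params(c, c, k, groups=g))
    print("shuffled channels:   ", channel_shuffle(list(range(8)), 2))
```

Under these assumed values, the factorized form needs 2k rather than k² weights per channel pair, and grouping divides that further by the number of groups. The reshuffle then lets information cross group boundaries in the next layer.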

MPLA-Net: Multiple Pseudo Label Aggregation Network for Weakly Supervised Video Salient Object Detection

Spatial Likelihood Voting with Self-Knowledge Distillation for Weakly Supervised Object Detection.

SLV: Spatial Likelihood Voting for Weakly Supervised Object Detection

WUSL–SOD: Joint Weakly Supervised, Unsupervised and Supervised Learning for Salient Object Detection

A Weakly Supervised Learning Framework for Salient Object Detection via Hybrid Labels

Weakly Supervised Salient Object Detection Using Image Labels

Noise-Sensitive Adversarial Learning for Weakly Supervised Salient Object Detection

Saliency Guided End-to-End Learning for Weakly Supervised Object Detection.

Saliency Guided End-to-End Learning for Weakly Supervised Object Detection

Weakly Supervised Video Salient Object Detection via Point Supervision

MOL: Towards Accurate Weakly Supervised Remote Sensing Object Detection Via Multi-view Noisy Learning

Cyclic-Bootstrap Labeling for Weakly Supervised Object Detection

Multi-Granularity Denoising and Bidirectional Alignment for Weakly Supervised Semantic Segmentation

AWANet: Attentive-Aware Wide-Kernels Asymmetrical Network with Blended Contour Information for Salient Object Detection

Weakly supervised salient object detection via bounding-box annotation and SAM model

PCL: Proposal Cluster Learning for Weakly Supervised Object Detection

Complementary characteristics fusion network for weakly supervised salient object detection

Category-Aware Saliency Enhance Learning Based on CLIP for Weakly Supervised Salient Object Detection

Video Salient Object Detection via Fully Convolutional Networks

Refining and reweighting pseudo labels for weakly supervised object detection

A novel seminar learning framework for weakly supervised salient object detection