Abstract: Saliency detection has attracted growing research interest in recent years, since many computer vision applications need to derive object attention from images as a first step. Multi-scale awareness in the saliency detector is essential for finding thin and small attention regions while preserving high-level semantics. In this paper, we propose a novel holistic and deep feature pyramid neural network architecture that leverages multi-scale semantics in both the feature encoding stage and the saliency region prediction (decoding) stage. In the encoding stage, we exploit the multi-scale, pyramidal hierarchy of feature maps via a densely connected network with variable-size dilated convolutions as well as pyramid pooling. In the decoding stage, we fuse multi-level feature maps via up-sampling and convolution. In addition, we apply multi-level deep supervision by plugging in loss functions at every feature fusion level. Multi-loss supervision regularizes the weight search space across the different tasks, which reduces overfitting, and strengthens the gradient signal during backpropagation, which enables us to train the network from scratch. This architecture builds inherently multi-level semantic pyramidal feature maps at different scales and enhances the model's capability in the saliency detection task. We validated our approach on six benchmark datasets and compared it with eleven state-of-the-art methods. The results demonstrate the effectiveness of the design and show that our approach outperformed the compared methods.

Corresponding authors: Zhifan Gao (gaozhifan@gmail.com) and Heye Zhang (hy.zhang@siat.ac.cn).

Funding: The National Natural Science Foundation of China (No. 61525106, 61427807, 61771464); Shenzhen innovation funding (JCYJ20170307165309009, JCYJ20170413114916687, SGLH20161212104605195).

© 2018. The copyright of this document resides with its authors. 29th British Machine Vision Conference (BMVC 2018).
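
The multi-level deep supervision described in the abstract (a loss attached at every feature fusion level) can be illustrated with a minimal sketch. The abstract does not state which loss, level weights, or up-sampling method are used, so the binary cross-entropy loss, the equal default weights, the nearest-neighbour up-sampling, and the function names `bce`, `upsample_nearest`, and `deep_supervision_loss` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two equally sized 2-D maps.

    Assumption: BCE is chosen for illustration; the abstract does not name the loss.
    """
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
            count += 1
    return total / count


def upsample_nearest(saliency_map, factor):
    """Nearest-neighbour up-sampling of a 2-D map by an integer factor (assumed method)."""
    out = []
    for row in saliency_map:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def deep_supervision_loss(level_preds, target, weights=None):
    """Weighted sum of per-level losses.

    Each coarse prediction is up-sampled to the target resolution before
    its loss is computed, so every fusion level receives its own
    supervision signal. Equal weights are an assumption.
    """
    if weights is None:
        weights = [1.0] * len(level_preds)
    height = len(target)
    total = 0.0
    for pred, w in zip(level_preds, weights):
        factor = height // len(pred)
        total += w * bce(upsample_nearest(pred, factor), target)
    return total


# Hypothetical 4x4 ground-truth mask and predictions at three scales.
target = [[0, 0, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
level_preds = [
    [[0.5]],                               # 1x1, coarsest level
    [[0.5, 0.5], [0.5, 0.5]],              # 2x2
    [[0.5] * 4 for _ in range(4)],         # 4x4, finest level
]
loss = deep_supervision_loss(level_preds, target)
```

Because every level contributes its own term, gradients reach the coarse layers directly instead of only through the final output, which is the effect the abstract attributes to multi-loss supervision.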

PiCANet: Learning Pixel-wise Contextual Attention in ConvNets and Its Application in Saliency Detection

PiCANet: Learning Pixel-wise Contextual Attention for Saliency Detection

PiCANet: Pixel-wise Contextual Attention Learning for Accurate Saliency Detection

Learning discriminative context for salient object detection

A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection

Multi-Level Context Aggregation Network with Channel-Wise Attention for Salient Object Detection

CANet: Contextual Information and Spatial Attention Based Network for Detecting Small Defects in Manufacturing Industry

CSA-Net: Deep Cross-Complementary Self Attention and Modality-Specific Preservation for Saliency Detection

PAM: Pyramid Attention Mechanism Based on Contextual Reasoning

Towards accurate RGB-D saliency detection with complementary attention and adaptive integration

Spatially-Aware Context Neural Networks

Improved U-Net-Like Network for Visual Saliency Detection Based on Pyramid Feature Attention

Pyramid Pixel Context Adaption Network for Medical Image Classification with Supervised Contrastive Learning

Pyramid Feature Attention Network for Saliency detection

Holistic and Deep Feature Pyramids for Saliency Detection

Salient Positions based Attention Network for Image Classification

SACANet: scene-aware class attention network for semantic segmentation of remote sensing images

Context-aware Graph Label Propagation Network for Saliency Detection

Global contextual guided residual attention network for salient object detection

SAC-Net: Spatial Attenuation Context for Salient Object Detection

Contextual Encoder-Decoder Network for Visual Saliency Prediction