29th British Machine Vision Conference (BMVC 2018)
Corresponding authors: Zhifan Gao (gaozhifan@gmail.com) and Heye Zhang (hy.zhang@siat.ac.cn)
Funding: National Natural Science Foundation of China (Nos. 61525106, 61427807, 61771464); Shenzhen innovation funding (JCYJ20170307165309009, JCYJ20170413114916687, SGLH20161212104605195)
© 2018. The copyright of this document resides with its authors.

Abstract: Saliency detection has attracted growing research interest in recent years, since many computer vision applications need to derive object attention from images as a first step. Multi-scale awareness is essential for a saliency detector to find thin and small attention regions while preserving high-level semantics. In this paper, we propose a novel holistic and deep feature pyramid neural network architecture that leverages multi-scale semantics in both the feature encoding stage and the saliency region prediction (decoding) stage. In the encoding stage, we exploit the multi-scale, pyramidal hierarchy of feature maps via a densely connected network with variable-size dilated convolutions as well as pyramid pooling. In the decoding stage, we fuse multi-level feature maps via up-sampling and convolution. In addition, we apply multi-level deep supervision by plugging in a loss function at every feature fusion level. Multi-loss supervision regularizes the weight search space across the different tasks, which helps minimize overfitting, and strengthens the gradient signal during backpropagation, which enables us to train the network from scratch. This architecture builds inherent multi-level semantic pyramidal feature maps at different scales and enhances the model's capability in the saliency detection task. We validated our approach on six benchmark datasets and compared it with eleven state-of-the-art methods. The results demonstrate the effectiveness of the design and show that our approach outperforms the compared methods.
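
The multi-level deep supervision mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: it assumes one binary cross-entropy loss per fusion level, a ground-truth mask resized to each level's resolution by nearest-neighbour sampling, and equal default weights. The names `resize_nearest`, `bce`, and `deep_supervision_loss` are hypothetical helpers introduced only for this illustration.

```python
import math

def resize_nearest(mask, out_h, out_w):
    """Nearest-neighbour resize of a 2-D list-of-lists mask (assumed scheme)."""
    in_h, in_w = len(mask), len(mask[0])
    return [[mask[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two same-sized 2-D maps."""
    total, count = 0.0, 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            count += 1
    return total / count

def deep_supervision_loss(level_preds, gt_mask, weights=None):
    """Weighted sum of per-level losses; returns (total, per-level list)."""
    weights = weights or [1.0] * len(level_preds)
    losses = []
    for pred in level_preds:
        # Match the ground truth to this level's spatial resolution.
        target = resize_nearest(gt_mask, len(pred), len(pred[0]))
        losses.append(bce(pred, target))
    return sum(w * l for w, l in zip(weights, losses)), losses

# Usage: a 4x4 ground-truth mask supervised at a coarse (2x2) and a fine (4x4) level.
gt = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
coarse_pred = [[0.9, 0.1], [0.2, 0.1]]
fine_pred = [[0.5] * 4 for _ in range(4)]
total_loss, per_level = deep_supervision_loss([coarse_pred, fine_pred], gt)
```

Because every level contributes its own term to the total, each fusion stage receives a direct gradient signal rather than only the signal propagated back from the final output, which is the effect the abstract attributes to multi-loss supervision.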

Pixel Saliency Based Encoding for Fine-Grained Image Classification.

Efficient Classification Using Salient Regions

Attention cutting and padding learning for fine-grained image recognition

Fine-Grained Image Classification Via Spatial Saliency Extraction.

Accurate salient object detection via dense recurrent connections and residual-based hierarchical feature integration.

Holistic and Deep Feature Pyramids for Saliency Detection.

Improve Neural Network Using Saliency.

FCN and Unit-Linking PCNN Based Image Saliency Detection

EMMCNN: An ETPS-Based Multi-Scale and Multi-Feature Method Using CNN for High Spatial Resolution Image Land-Cover Classification

Object-centric Sampling for Fine-grained Image Classification

Hierarchical Spiking-Based Model for Efficient Image Classification with Enhanced Feature Extraction and Encoding.

Subtler mixed attention network on fine-grained image classification

A Fine-Grained Image Classification and Detection Method Based on Convolutional Neural Network Fused with Attention Mechanism

Dual attention guided multi-scale CNN for fine-grained image classification

Selective Sparse Sampling for Fine-Grained Image Recognition

Fine-grained Discriminative Localization via Saliency-guided Faster R-CNN

An Object-Oriented CNN Model Based on Improved Superpixel Segmentation for High-Resolution Remote Sensing Image Classification

Fine-grained image classification method based on hybrid attention module

Pixel-Wise Classification Method For High Resolution Remote Sensing Imagery Using Deep Neural Networks

Hyperspectral image classification method based on narrowing semantic gap convolutional neural network

The Application of Two-Level Attention Models in Deep Convolutional Neural Network for Fine-Grained Image Classification