Enriched Feature Representation and Combination for Deep Saliency Detection

Lecheng Zhou, Xiaodong Gu
DOI: https://doi.org/10.1007/978-3-030-61609-0_55
2020-01-01
Abstract: One of the most challenging issues in visual saliency detection is discovering and integrating meaningful features through deep neural networks. A saliency detection model should be carefully designed to extract sufficient features from different levels and reorganize them into the final prediction. In this paper, we propose an efficient saliency detection framework that introduces multi-scale representation and multi-level combination into deep convolutional neural networks. The main idea of the proposed model is to optimize intra-level feature extraction and inter-level feature combination, so that both saliency semantics and object details are correctly preserved in the final saliency maps. The model uses parallel dilated convolutions and pyramid pooling structures to enhance local details and acquire a multi-scale feature representation. Feature maps of different resolutions are integrated through hierarchical combination in the encoder and in the decoder, respectively. As a result, the model better retains detail information during feature extraction and locates salient regions more reliably during saliency map recovery. Experimental results show that our model achieves state-of-the-art performance on several representative datasets.
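
The abstract mentions parallel dilated convolutions as one source of multi-scale features. The following is a minimal 1-D toy sketch of that general idea, not the authors' implementation: the function names (`dilated_conv1d`, `parallel_dilated_branches`, `receptive_field`), the dilation rates (1, 2, 4), and the all-ones kernel are assumptions chosen only for illustration. It shows how the same small kernel, applied at several dilation rates in parallel, covers progressively wider context at each position.

```python
def dilated_conv1d(signal, kernel, rate):
    """Zero-padded, same-length 1-D dilated cross-correlation.

    Assumes an odd-length kernel; taps are spaced `rate` samples apart.
    """
    k = len(kernel)
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - k // 2) * rate  # dilated tap position
            if 0 <= idx < len(signal):     # positions outside count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def parallel_dilated_branches(signal, kernel, rates=(1, 2, 4)):
    """Apply the same kernel at several rates and stack the responses
    per position (a stand-in for channel-wise concatenation)."""
    branches = [dilated_conv1d(signal, kernel, r) for r in rates]
    return [list(values) for values in zip(*branches)]


def receptive_field(kernel_size, rate):
    """Span, in samples, covered by one dilated kernel."""
    return (kernel_size - 1) * rate + 1


# Hypothetical input: a single impulse in the middle of a 9-sample signal.
signal = [0, 0, 0, 0, 1, 0, 0, 0, 0]
features = parallel_dilated_branches(signal, [1, 1, 1])
```

With a 3-tap kernel, the three branches span 3, 5, and 9 samples respectively, so each position ends up with one response per scale. In a real network the branches would be learned 2-D convolutions whose outputs are concatenated or summed; how the paper combines them is not specified in the abstract.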