Inferring Salient Objects from Human Fixations

Wenguan Wang,Jianbing Shen,Xingping Dong,Ali Borji,Ruigang Yang
DOI: https://doi.org/10.1109/tpami.2019.2905607
IF: 23.6
2020-01-01
IEEE Transactions on Pattern Analysis and Machine Intelligence
Abstract:Previous research in visual saliency has been focused on two major types of models namely fixation prediction and salient object detection. The relationship between the two, however, has been less explored. In this work, we propose to employ the former model type to identify salient objects. We build a novel Attentive Saliency Network (ASNet)11.Available at: https://github.com/wenguanwang/ASNet. that learns to detect salient objects from fixations. The fixation map, derived at the upper network layers, mimics human visual attention mechanisms and captures a high-level understanding of the scene from a global view. Salient object detection is then viewed as fine-grained object-level saliency segmentation and is progressively optimized with the guidance of the fixation map in a top-down manner. ASNet is based on a hierarchy of convLSTMs that offers an efficient recurrent mechanism to sequentially refine the saliency features over multiple steps. Several loss functions, derived from existing saliency evaluation metrics, are incorporated to further boost the performance. Extensive experiments on several challenging datasets show that our ASNet outperforms existing methods and is capable of generating accurate segmentation maps with the help of the computed fixation prior. Our work offers a deeper insight into the mechanisms of attention and narrows the gap between salient object detection and fixation prediction.
What problem does this paper attempt to address?