Salient Object Detection Using Cascaded Convolutional Neural Networks and Adversarial Learning

Youbao Tang, Xiangqian Wu
DOI: https://doi.org/10.1109/tmm.2019.2900908
IF: 7.3
2019-09-01
IEEE Transactions on Multimedia
Abstract: Salient object detection has received much attention and achieved great success in the last several years. It is still challenging to obtain clear boundaries and consistent saliencies, which can be considered the structural information of salient objects. A popular solution is to apply post-processing (e.g., a conditional random field (CRF)) to refine this structural information. In this paper, a novel method based on cascaded convolutional neural networks (CNNs) is proposed to implicitly learn this structural information via adversarial learning for salient object detection (we term the proposed method CCAL). A cascaded CNN model is first designed as a generator $\boldsymbol{G}$, which consists of an encoder-decoder network for global saliency estimation and a deep residual network for local saliency refinement. It is hard to learn such structural information explicitly because of the limitations of the frequently used pixel-wise loss functions. Instead, a discriminator $\boldsymbol{D}$ is designed to distinguish the real saliency maps (i.e., ground truths) from the fake ones produced by $\boldsymbol{G}$, based on which an adversarial loss is introduced to optimize $\boldsymbol{G}$. $\boldsymbol{G}$ and $\boldsymbol{D}$ are trained in a fully end-to-end fashion, following the strategy of conditional generative adversarial networks, so that $\boldsymbol{G}$ learns the structural information well. Finally, $\boldsymbol{G}$ is able to produce high-quality saliency maps that fool $\boldsymbol{D}$ without requiring any post-processing. Experimental results on eight benchmark datasets demonstrate the effectiveness and efficiency (about 17 fps on a graphics processing unit (GPU)) of the proposed method for salient object detection.
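To make the training signal described in the abstract more concrete, the following is a minimal sketch of how a pixel-wise loss and an adversarial loss might be combined in a conditional GAN setup. The abstract does not give the exact loss formulation, so everything here is an assumption for illustration: the function names, the use of binary cross-entropy for both terms, the non-saturating form of the generator's adversarial term, and the hypothetical weight `adv_weight` are not taken from the paper.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predictions in (0, 1) and targets in [0, 1]."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred)))

def discriminator_loss(d_real, d_fake):
    """Loss for D: score (image, ground truth) pairs as real, (image, G(image)) pairs as fake.

    d_real and d_fake are D's output probabilities for the two kinds of pairs.
    """
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

def generator_loss(saliency_pred, saliency_gt, d_fake, adv_weight=0.01):
    """Loss for G: pixel-wise term plus a weighted adversarial term.

    adv_weight is a hypothetical balancing factor, not a value from the paper.
    The adversarial term is small when D scores G's output as real.
    """
    pixel_term = bce(saliency_pred, saliency_gt)
    adv_term = bce(d_fake, np.ones_like(d_fake))
    return pixel_term + adv_weight * adv_term
```

In this sketch the pixel-wise term scores each pixel independently, while the adversarial term depends on how D judges the whole map, which is one way to read the abstract's claim that structural information is learned implicitly.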