Deep3DSaliency: Deep Stereoscopic Video Saliency Detection Model by 3D Convolutional Networks.

Yuming Fang, Guanqun Ding, Jia Li, Zhijun Fang
DOI: https://doi.org/10.1109/tip.2018.2885229
IF: 10.6
2019-01-01
IEEE Transactions on Image Processing
Abstract: Stereoscopic saliency detection plays an important role in various stereoscopic video processing applications. However, conventional stereoscopic video saliency detection methods mainly use independent low-level features instead of extracting them automatically, and thus, they ignore the intrinsic relationship between the spatial and temporal information. In this paper, we propose a novel stereoscopic video saliency detection method based on 3D convolutional neural networks, namely Deep 3D Video Saliency (Deep3DSaliency). The proposed network consists of two sub-models: Spatiotemporal Saliency Model (STSM), and Stereoscopic Saliency Aware Model (SSAM). STSM directly takes three consecutive video frames as the input to extract visual spatiotemporal features, while SSAM attempts to further infer the depth and semantic features from the left and right video frames by shared parameters from STSM. The visual spatiotemporal features from STSM, and the depth and semantic features from SSAM are learned by an alternating optimization scheme. Finally, all these saliency-related features are combined together for the final stereoscopic saliency detection via 3D deconvolution. Experimental results show the superior performance of the proposed model over other existing ones in saliency estimation for 3D video sequences.
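As a rough illustration of why a 3D convolution over three consecutive frames can capture spatial and temporal information jointly, the sketch below applies a single 3D kernel to a toy three-frame clip. It is not the authors' network: the function name conv3d_valid, the toy clip, and the hand-picked temporal-difference kernel are all assumptions made for this example, and in a trained model such kernels would be learned rather than fixed.

```python
def conv3d_valid(clip, kernel):
    """Naive 'valid' 3D cross-correlation (the operation deep-learning
    libraries usually call 3D convolution).

    clip:   nested lists of shape (T, H, W)
    kernel: nested lists of shape (kT, kH, kW)
    returns nested lists of shape (T-kT+1, H-kH+1, W-kW+1)
    """
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kT, kH, kW = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kT + 1):
        plane = []
        for y in range(H - kH + 1):
            row = []
            for x in range(W - kW + 1):
                acc = 0.0
                # Sum over the temporal and both spatial kernel axes at once.
                for dt in range(kT):
                    for dy in range(kH):
                        for dx in range(kW):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical toy clip: three consecutive 4x4 grayscale frames in which
# one bright pixel moves one column to the right per frame (row 1).
frames = [[[0.0] * 4 for _ in range(4)] for _ in range(3)]
for t in range(3):
    frames[t][1][t] = 1.0

# Hand-picked kernel of shape (3, 1, 1): last frame minus first frame.
# A learned kernel would also span a spatial neighbourhood.
temporal_diff = [[[-1.0]], [[0.0]], [[1.0]]]

features = conv3d_valid(frames, temporal_diff)

# The three input frames collapse into a single feature map of size 4x4.
print(len(features), len(features[0]), len(features[0][0]))
for row in features[0]:
    print(row)
```

In the resulting map, the response is negative where the pixel started and positive where it ended, so a single filter encodes both where something is and how it moved. A purely per-frame (2D) filter cannot produce that response.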