Viewport-Dependent Saliency Prediction in 360° Video

Minglang Qiao, Mai Xu, Zulin Wang, Ali Borji
DOI: https://doi.org/10.1109/TMM.2020.2987682
IF: 7.3
2021-01-01
IEEE Transactions on Multimedia
Abstract: Saliency prediction in traditional images and videos has drawn extensive research interest in recent years. Only a few works have been proposed for saliency prediction over 360° videos, and they focus on directly predicting fixations over the whole panorama. When viewing a 360° video, however, a person can only observe the content in her viewport, which means that only a fraction of the 360° scene can be seen at any given time. In this paper, we study human attention within the viewport of 360° videos and propose a novel visual saliency model, dubbed viewport saliency, to predict fixations over 360° videos. Two contributions are introduced. First, we find that where people look is affected by both the content and the location of the viewport in a 360° video. We study this over 200+ 360° videos viewed by 30+ subjects across two recent benchmark databases. Second, we propose a Multi-Task Deep Neural Network (MT-DNN) method for Viewport Saliency (VS) prediction in 360° video, which takes both the content and the location of the viewport as input. Extensive experiments and analyses show that our method outperforms other state-of-the-art methods in this task. In particular, over the two recent 360° video databases, our MT-DNN raises the average CC score by 0.149 and 0.205 compared with the SalGAN and DeepVS methods, respectively.
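The abstract reports its gains as CC scores but does not define the metric. In the saliency literature, CC usually means the linear (Pearson) correlation coefficient between a predicted saliency map and a ground-truth fixation density map. The sketch below assumes that standard definition. The function name `cc_score` and the toy arrays are illustrative only and are not taken from the paper.

```python
import numpy as np


def cc_score(pred, gt):
    """Pearson correlation between two saliency maps of the same shape.

    Assumption: this is the standard CC definition used in saliency
    evaluation; the paper's exact evaluation code is not shown here.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    if pred.shape != gt.shape:
        raise ValueError("saliency maps must have the same shape")

    # Center both maps so the score measures co-variation only.
    p = pred - pred.mean()
    g = gt - gt.mean()

    denom = np.sqrt((p ** 2).sum() * (g ** 2).sum())
    if denom == 0.0:
        # A constant map has no variance; report zero correlation.
        return 0.0
    return float((p * g).sum() / denom)


# Toy 2x3 maps, purely illustrative.
gt_map = np.array([[0.0, 0.2, 0.9], [0.1, 0.4, 0.3]])
pred_map = 2.0 * gt_map + 0.5  # a positive affine transform of gt_map
print(cc_score(pred_map, gt_map))
```

Because CC is invariant to positive scaling and offset, a prediction that is a positive affine transform of the ground truth scores essentially 1. The score ranges from -1 to 1, so an absolute improvement of 0.149 or 0.205 is a large step on that scale.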