360Spred: Saliency Prediction for 360-Degree Videos Based on 3D Separable Graph Convolutional Networks

Qin Yang, Wenxuan Gao, Chenglin Li, Hao Wang, Wenrui Dai, Junni Zou, Hongkai Xiong, Pascal Frossard
DOI: https://doi.org/10.1109/tcsvt.2024.3407685
IF: 5.859
2024-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Predicting the saliency map of a 360-degree video is key to various downstream tasks, such as saliency-based compression and tile-based adaptive streaming. Besides static salient objects, moving targets also contribute to the saliency map, so accurate saliency prediction requires jointly exploiting spherical spatio-temporal information. Spherical spatial feature extraction, however, is hindered by the non-Euclidean geometry of spherical data, which makes it difficult to extract spatial features directly with traditional convolutional neural networks (CNNs). Efficiently exploiting the temporal correlation between these spherical spatial features is a further challenge, as it requires extracting spherical optical flows to obtain explicit motion information. To address these challenges, we first propose a spherical graph-based Farneback algorithm that extracts spherical optical flows directly in the sphere domain by leveraging the GICOPix uniform sampling scheme. We then design a 3D separable graph convolutional network-based saliency prediction framework, named 360Spred, which takes both the spherical frames and the spherical optical flows as input. The 360Spred framework is based on the U-Net structure, with a 3D separable graph convolution (3DSGC) operator that extracts visual and motion features directly in the sphere domain and exploits the temporal correlation of both high-level and low-level spatial features. Experimental results on two public datasets show that 360Spred outperforms the baseline models in saliency prediction accuracy for 360-degree videos.
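The abstract does not spell out the 3DSGC operator, so the following is only a minimal sketch of one generic way a separable spatio-temporal graph convolution can be built: a polynomial graph filter applied to each frame, followed by a per-channel 1D filter along time. The function names, the normalized-Laplacian filter, the zero padding, and the toy ring graph are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2} (assumed graph operator)."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    return np.eye(adj.shape[0]) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

def separable_st_graph_conv(x, lap, w_spatial, w_temporal):
    """Hypothetical separable spatio-temporal graph convolution.

    x:          (T, N, C_in)   features on N graph nodes over T frames
    lap:        (N, N)         graph Laplacian
    w_spatial:  (K, C_in, C_out) weights of an order-(K-1) polynomial graph filter
    w_temporal: (tau, C_out)   per-channel temporal filter taps
    returns:    (T, N, C_out)
    """
    T, N, _ = x.shape
    K, _, c_out = w_spatial.shape

    # Spatial step: sum_k (L^k x_t) W_k, applied independently to every frame.
    out = np.zeros((T, N, c_out))
    xk = x  # L^0 x
    for k in range(K):
        out += xk @ w_spatial[k]
        xk = np.einsum("ij,tjc->tic", lap, xk)  # next power of L applied to x

    # Temporal step: depthwise 1D filter along the time axis with zero padding.
    tau = w_temporal.shape[0]
    pad = tau // 2
    padded = np.pad(out, ((pad, tau - 1 - pad), (0, 0), (0, 0)))
    y = np.zeros_like(out)
    for s in range(tau):
        y += padded[s:s + T] * w_temporal[s]
    return y

# Toy usage on a 6-node ring graph (a stand-in for a spherical sampling graph).
adj = np.zeros((6, 6))
for i in range(6):
    adj[i, (i + 1) % 6] = adj[(i + 1) % 6, i] = 1.0
lap = normalized_laplacian(adj)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6, 2))       # 4 frames, 6 nodes, 2 input channels
w_s = rng.standard_normal((3, 2, 5))     # order-2 graph filter, 5 output channels
w_t = rng.standard_normal((3, 5))        # 3 temporal taps per channel
y = separable_st_graph_conv(x, lap, w_s, w_t)
```

Splitting the operator this way keeps the parameter count at K·C_in·C_out + tau·C_out rather than growing with the product of the spatial and temporal supports, which is the usual motivation for separable designs.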