Visual Comfort Classification for Stereoscopic Videos Based on Two-Stream Recurrent Neural Network with Multi-level Attention.

Weize Gan, Danhong Peng, Yuzhen Niu
DOI: https://doi.org/10.1145/3561613.3561628
2022-01-01
Abstract: Due to the differences in visual systems between children and adults, a professional stereoscopic 3D video may not be comfortable for children. In this paper, we aim to answer whether a stereoscopic video is comfortable for children to watch by solving the visual comfort classification problem for stereoscopic videos. In particular, we propose a two-stream recurrent neural network (RNN) with multi-level attention for this task. First, we propose a two-stream RNN to extract and fuse spatial and temporal features from video frames and disparity maps. Furthermore, we propose using multi-level attention to effectively enhance the features at the frame level, the shot level, and finally the video level. In addition, to the best of our knowledge, we establish the first high-definition stereoscopic 3D video dataset for performance evaluation. Experimental results show that our proposed model can effectively classify professional stereoscopic videos as visually comfortable for children or suitable for adults only.
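The multi-level attention described in the abstract can be pictured as repeated attention pooling: frame features are pooled into a shot feature, and shot features are pooled into a video feature. The following is a minimal sketch of that idea only, not the authors' model. It omits the two-stream RNN entirely, and the dot-product scoring, the query vectors, and the names `attention_pool` and `video_feature` are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Weighted average of `vectors`.

    Weights are the softmax of each vector's dot product with `query`
    (an assumed scoring rule; the paper may use a different one).
    """
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def video_feature(shots, frame_query, shot_query):
    """Pool frames into shot features, then shots into one video feature.

    `shots` is a list of shots; each shot is a list of per-frame feature
    vectors (hypothetically, fused frame and disparity-map features).
    """
    shot_vectors = [attention_pool(frames, frame_query) for frames in shots]
    return attention_pool(shot_vectors, shot_query)
```

With an all-zero query every item receives the same weight, so the pooling reduces to a plain average; a learned query would instead emphasize the frames or shots most relevant to visual comfort.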