Abstract: Omnidirectional images provide an immersive viewing experience in Virtual Reality (VR) environments, going beyond the limits of traditional 2D media on a conventional screen. VR technology lets users interact with visual information in an engaging manner. However, the storage and transmission requirements of 360-degree panoramic images are substantial, which has led to the establishment of compression frameworks. Unfortunately, these frameworks introduce projection distortion and compression artifacts. With the rapid growth of VR applications, it becomes crucial to investigate the perceived quality of the omnidirectional experience and to evaluate the extent of visual degradation caused by compression. Viewports play a significant role in omnidirectional image quality assessment (OIQA) because they directly affect the user's perceived quality and overall viewing experience, so extracting viewports that are compatible with users' viewing behavior is crucial. Different users may focus on different regions, and a model's performance may be sensitive to the chosen viewport extraction strategy; improper selection of viewports could lead to biased quality predictions. Instead of assessing the entire image, attention can be directed to the areas that are more important to the overall quality. Feature extraction is also vital in OIQA, as it represents image content in a way that aligns with human perception. Taking this into consideration, the proposed ATtention enabled VIewport Selection model (ATVIS-OIQA) employs attention-based viewport selection with Vision Transformers (ViT) for feature extraction. Furthermore, the spatial relationship between the viewports is established using graph convolution, enabling intuitive prediction of the objective visual quality of omnidirectional images. The effectiveness of the proposed model is demonstrated by state-of-the-art results on the publicly available benchmark datasets OIQA and CVIQD.
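To make the graph-convolution step more concrete, the following is a minimal sketch of how spatial relationships between viewports could be encoded and propagated. It is not the ATVIS-OIQA implementation: the abstract does not specify how the graph is built, so the angular-distance threshold, the symmetric normalization, the toy feature vectors (standing in for ViT features), the weight matrix, and all function names here are assumptions chosen for illustration.

```python
import math

def angular_distance(a, b):
    """Great-circle angle in radians between two (lon, lat) points in radians."""
    (lon1, lat1), (lon2, lat2) = a, b
    cos_angle = (math.sin(lat1) * math.sin(lat2)
                 + math.cos(lat1) * math.cos(lat2) * math.cos(lon1 - lon2))
    return math.acos(max(-1.0, min(1.0, cos_angle)))  # clamp for float safety

def build_adjacency(centers, max_angle):
    """Link viewports whose centers lie within max_angle; self-loops included."""
    n = len(centers)
    return [[1.0 if i == j or angular_distance(centers[i], centers[j]) <= max_angle
             else 0.0 for j in range(n)] for i in range(n)]

def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} (assumed; other schemes exist)."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def graph_conv(adj_norm, features, weights):
    """One graph-convolution layer: ReLU(A_norm @ H @ W)."""
    out = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in out]

# Hypothetical example: four viewports centered on the equator, 90 degrees apart.
centers = [(0.0, 0.0), (math.pi / 2, 0.0), (math.pi, 0.0), (-math.pi / 2, 0.0)]
# Toy 2-D vectors standing in for per-viewport ViT features (assumption).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
# Hypothetical weights mapping 2 features to 1 score; normally learned.
weights = [[1.0], [1.0]]

adj_norm = normalize(build_adjacency(centers, max_angle=math.pi / 2 + 1e-9))
node_scores = graph_conv(adj_norm, features, weights)
# Average the per-viewport outputs into a single image-level score (assumed pooling).
quality = sum(score[0] for score in node_scores) / len(node_scores)
```

In this toy setup each viewport is linked to itself and its two nearest neighbors, so every node's output mixes its own features with those of adjacent viewports before pooling. A trained model would learn `weights` and would typically stack several such layers.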

Exploring Viewport Features for Semi-Supervised Saliency Prediction in Omnidirectional Images

Omnisupervised Omnidirectional Semantic Segmentation

Semi-Supervised Learning for Visual Bird's Eye View Semantic Segmentation

Spatial Likelihood Voting with Self-Knowledge Distillation for Weakly Supervised Object Detection

Self-supervised Visual-LiDAR Odometry with Flip Consistency

Saliency Prediction on Omnidirectional Images with Generative Adversarial Imitation Learning

Saliency Prediction Network for 360° Videos

Spherical Vision Transformer for 360-degree Video Saliency Prediction

SalNet360: Saliency Maps for omni-directional images with CNN

Viewport-Sphere-Branch Network for Blind Quality Assessment of Stitched 360° Omnidirectional Images

Saliency Prediction for Omnidirectional Images Considering Optimization on Sphere Domain

MRGAN360: Multi-stage Recurrent Generative Adversarial Network for 360 Degree Image Saliency Prediction

Predicting 360° Video Saliency: A ConvLSTM Encoder-Decoder Network with Spatio-temporal Consistency

Viewport-based CNN: A Multi-task Approach for Assessing 360° Video Quality

Multi-view contextual adaptation network for weakly supervised object detection in remote sensing images

Attention enabled viewport selection with graph convolution for omnidirectional visual quality assessment

Eye Scanpath Prediction-Based No-Reference Quality Assessment of Omnidirectional Images

360° Image Saliency Prediction by Embedding Self-Supervised Proxy Task

Spatial Attention-based Non-reference Perceptual Quality Prediction Network for Omnidirectional Images