A Two-Fold Patch Selection Approach for Improved 360-Degree Image Quality Assessment

Abderrezzaq Sendjasni, Seif-Eddine Benkabou, Mohamed-Chaker Larabi
2024-12-17
Abstract: This article presents a novel approach to improving the accuracy of 360-degree perceptual image quality assessment (IQA) through a two-fold patch selection process. Our methodology combines visual patch selection with embedding similarity-based refinement. The first stage selects patches from 360-degree images using three distinct sampling methods to ensure comprehensive coverage of visual content for IQA. The second stage, which is the core of our approach, employs an embedding similarity-based selection process to filter and prioritize the most informative patches based on their embedding similarity distances. This dual selection mechanism ensures that the training data is both relevant and informative, enhancing the model's learning efficiency. Extensive experiments and statistical analyses using three distance metrics across three benchmark datasets validate the effectiveness of our selection algorithm. The results highlight its potential to deliver robust and accurate 360-degree IQA, with performance gains of up to 4.5% in accuracy and monotonicity of quality score prediction, while using only 40% to 50% of the training patches. These improvements are consistent across various configurations and evaluation metrics, demonstrating the strength of the proposed method. The code for the selection process is available at: https://github.com/sendjasni/patch-selection-360-image-quality
Computer Vision and Pattern Recognition, Machine Learning
### What problem does this paper attempt to address?

This paper addresses several key challenges in 360-degree (panoramic) image quality assessment (IQA):

1. **Complex distortion and non-uniform sampling**: Because 360-degree images are spherical, they are prone to complex distortions and non-uniform sampling, so perceived quality can differ considerably from one region to another.
2. **Impact of immersive and interactive applications**: Applications that use 360-degree content are immersive and interactive, so users' viewing conditions, expectations, and visual exploration patterns also affect perceived quality.
3. **Limitations of existing methods**: Current deep-learning models for 360-degree IQA rely on large labeled datasets for training, which are costly and time-consuming to build. These models are also highly sensitive to the quality and diversity of the training data, which can lead to poor performance in practical applications.

To address these problems, the authors propose a two-stage patch selection method intended to improve the accuracy and robustness of 360-degree image quality assessment:

- **First stage: Multi-strategy visual patch selection**: Three sampling methods (projection-based, latitude importance, and multi-view trajectory) select patches from 360-degree images to ensure comprehensive coverage of the visual content.
- **Second stage: Embedding similarity-optimized selection**: The patches are further filtered and prioritized according to their embedding similarity distances, so that the training data is relevant and informative and the model learns more efficiently.

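
As a rough illustration of the latitude importance idea in the first stage, the sketch below draws patch centres so that they concentrate near the equator of the sphere. The summary does not say how the paper weights latitudes, so the cosine weighting, the function names `sample_patch_centres` and `to_erp_pixel`, and the equirectangular (ERP) pixel mapping are all assumptions made for this example, not the authors' implementation.

```python
import math
import random


def sample_patch_centres(n_patches, seed=0):
    """Draw (latitude, longitude) patch centres in degrees.

    Latitude is drawn with density proportional to cos(latitude), so centres
    concentrate near the equator. This weighting is an assumption for
    illustration, not the rule documented in the paper.
    """
    rng = random.Random(seed)
    centres = []
    for _ in range(n_patches):
        u = rng.random()
        # Inverse-CDF sampling: asin(2u - 1) has density ~ cos(latitude).
        lat = math.degrees(math.asin(2.0 * u - 1.0))
        lon = rng.uniform(-180.0, 180.0)
        centres.append((lat, lon))
    return centres


def to_erp_pixel(lat, lon, width, height):
    """Map a centre in degrees to (row, col) in an equirectangular image."""
    col = int((lon + 180.0) / 360.0 * (width - 1))
    row = int((90.0 - lat) / 180.0 * (height - 1))
    return row, col


if __name__ == "__main__":
    for lat, lon in sample_patch_centres(5):
        print(to_erp_pixel(lat, lon, width=4096, height=2048))
```

Under this weighting about half of the centres fall within 30 degrees of the equator, compared with one third if latitudes were drawn uniformly.
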
Through this method, the authors aim to improve the accuracy and monotonicity of 360-degree image quality assessment while reducing the number of training samples, and they verify its effectiveness on multiple benchmark datasets.

### Formula summary

The formulas in the article mainly describe the optimization problems in the embedding similarity selection process:

- Objective function minimization problem:

  \[ \min_{W_i} \| (E_i W_i) (E_i W_i)^\top - S_i \|_F^2 + \alpha \| W_i \|_{2,1} \]

  where:
  - \( S_i \) is the similarity matrix calculated in the embedding space \( E_i \).
  - \( W_i \in \mathbb{R}^{d_i \times h} \) is the transformation matrix.
  - \( \alpha \| W_i \|_{2,1} \) is the regularization term used to promote the sparsity of the coefficients.

- Optimization problem introduced by residual analysis:

  \[ \min_{W_i, R_i} \| E_i W_i - Z_i - R_i^\top \|_F^2 + \alpha \| W_i \|_{2,1} + \beta \| R_i \|_{2,1} \]

  where:
  - \( R_i \) is the residual matrix used to identify irrelevant patches.
  - \( \beta \) is the regularization hyperparameter that controls the number of patches selected through \( R_i \).

Together, these formulations are intended to keep the selected patches relevant and of high quality, thereby improving the accuracy of 360-degree image quality assessment.
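
To make the first objective concrete, the following sketch evaluates \( \| (E_i W_i)(E_i W_i)^\top - S_i \|_F^2 + \alpha \| W_i \|_{2,1} \) for given matrices. It only computes the value of the objective and does not solve the minimization. It assumes that \( E_i \) stores one patch embedding per row and that the \( \ell_{2,1} \) norm is the sum of the Euclidean norms of the rows; the helper names are hypothetical, and plain Python lists are used to keep the example dependency-free.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_sq(a, b):
    """Squared Frobenius norm of the difference a - b."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def l21_norm(m):
    """Sum of the Euclidean norms of the rows of m (assumed l2,1 convention)."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in m)


def objective(E, W, S, alpha):
    """Value of ||(E W)(E W)^T - S||_F^2 + alpha * ||W||_{2,1}."""
    P = matmul(E, W)              # projected embeddings, shape n x h
    G = matmul(P, transpose(P))   # pairwise inner products, shape n x n
    return frobenius_sq(G, S) + alpha * l21_norm(W)


if __name__ == "__main__":
    E = [[1.0, 0.0], [0.0, 1.0]]  # two toy patch embeddings (rows)
    S = [[1.0, 0.0], [0.0, 1.0]]  # toy similarity matrix
    W_dense = [[1.0, 0.0], [0.0, 1.0]]
    W_sparse = [[1.0, 0.0], [0.0, 0.0]]  # second row zeroed out
    print(objective(E, W_dense, S, alpha=0.5))   # 1.0
    print(objective(E, W_sparse, S, alpha=0.5))  # 1.5
```

In the toy run, zeroing a row of \( W_i \) lowers the \( \ell_{2,1} \) penalty from 2 to 1 but adds a reconstruction error of 1, which shows the trade-off that \( \alpha \) controls.
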