TINQ: Temporal Inconsistency Guided Blind Video Quality Assessment

Yixiao Li, Xiaoyuan Yang, Weide Liu, Xin Jin, Xu Jia, Yukun Lai, Haotao Liu, Paul L Rosin, Wei Zhou
2024-12-25
Abstract: Blind video quality assessment (BVQA) has been actively researched for user-generated content (UGC) videos. Recently, super-resolution (SR) techniques have been widely applied in UGC. Therefore, an effective BVQA method for both UGC and SR scenarios is essential. Temporal inconsistency, referring to irregularities between consecutive frames, is relevant to video quality. Current BVQA approaches typically model temporal relationships in UGC videos using statistics of motion information, but inconsistencies remain unexplored. Additionally, different from temporal inconsistency in UGC videos, such inconsistency in SR videos is amplified due to upscaling algorithms. In this paper, we introduce the Temporal Inconsistency Guided Blind Video Quality Assessment (TINQ) metric, demonstrating that exploring temporal inconsistency is crucial for effective BVQA. Since temporal inconsistencies vary between UGC and SR videos, they are calculated in different ways. Based on this, a spatial module highlights inconsistent areas across consecutive frames at coarse and fine granularities. In addition, a temporal module aggregates features over time in two stages. The first stage employs a visual memory capacity block to adaptively segment the time dimension based on estimated complexity, while the second stage focuses on selecting key features. The stages work together through Consistency-aware Fusion Units to regress cross-time-scale video quality. Extensive experiments on UGC and SR video quality datasets show that our method outperforms existing state-of-the-art BVQA methods. Code is available at https://github.com/Lighting-YXLI/TINQ.
Computer Vision and Pattern Recognition, Multimedia, Image and Video Processing
What problem does this paper attempt to address?
This paper addresses blind video quality assessment (BVQA) for both user-generated content (UGC) and super-resolution (SR) videos. Specifically, it focuses on the following key points:

1. **Limitations of existing BVQA methods**:
   - Current BVQA methods mainly model temporal relationships with statistics of motion information and ignore temporal inconsistency, that is, irregularities between consecutive frames.
   - BVQA methods designed for UGC videos are unstable on SR videos because super-resolution techniques amplify temporal inconsistency.
2. **Temporal inconsistency in super-resolution videos**:
   - Temporal inconsistency in SR videos differs from that in UGC videos. Upsampling algorithms amplify it, which leads to poor performance of existing BVQA methods on SR videos.
3. **Proposing a new evaluation metric**:
   - The paper introduces TINQ (Temporal Inconsistency Guided Blind Video Quality Assessment), a BVQA method guided by temporal inconsistency. By exploring temporal inconsistency, it improves the accuracy of quality assessment for UGC and SR videos.
4. **Calculating temporal inconsistency**:
   - For UGC videos, temporal inconsistency is calculated from the optical flow of the video itself.
   - For SR videos, temporal inconsistency is calculated from the difference in optical flow between the SR video and its reference video.

   The formula is as follows:

   \[
   VI =
   \begin{cases}
   \|\text{OF}(VR) - \text{OF}(VD)\|_2, & VD \in \text{SR video} \\
   \|\text{OF}(VD)\|_2, & VD \in \text{UGC video}
   \end{cases}
   \]

   where \(VD\) is the video being assessed, \(VR\) is the reference video of an SR video, \(\text{OF}(\cdot)\) represents optical flow calculation, and \(\|\cdot\|_2\) represents the 2-norm.
5. **Model structure**:
   - **Spatial module**: Captures temporal inconsistency in videos through coarse-grained and fine-grained spatial feature extractors (including Transformer and CNN), and highlights inconsistent areas through pixel-level weighting.
   - **Temporal module**: Aggregates features over time in two stages. It dynamically allocates memory thresholds based on the visual working memory mechanism to adapt to scene changes of different complexity, and finally regresses video quality scores across multiple time scales.
6. **Experimental verification**:
   - Extensive experiments on multiple UGC and SR video quality datasets show that TINQ outperforms existing state-of-the-art BVQA methods on these datasets.

In summary, by using temporal inconsistency as a guide, this paper proposes TINQ, a new blind video quality assessment method that aims to assess the quality of both UGC and SR videos effectively and to address the shortcomings of existing methods on these two types of videos.
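
The piecewise definition of temporal inconsistency in point 4 above can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes the optical flow between two consecutive frames has already been estimated by some external method and is given as a grid of `(dx, dy)` vectors, and it assumes the 2-norm is taken per pixel, which the summary does not state explicitly. The names `Flow` and `inconsistency_map` are hypothetical.

```python
import math
from typing import List, Optional, Tuple

# Hypothetical representation: an H x W grid of (dx, dy) optical-flow vectors
# for one pair of consecutive frames, estimated elsewhere.
Flow = List[List[Tuple[float, float]]]


def inconsistency_map(flow_vd: Flow, flow_vr: Optional[Flow] = None) -> List[List[float]]:
    """Per-pixel temporal inconsistency for one pair of consecutive frames.

    UGC case (no reference flow given): ||OF(VD)||_2
    SR case (reference flow given):     ||OF(VR) - OF(VD)||_2

    Assumption: the norm is applied to each pixel's flow vector separately.
    """
    if flow_vr is None:
        # UGC video: magnitude of the flow of the assessed video itself.
        return [[math.hypot(dx, dy) for dx, dy in row] for row in flow_vd]

    same_shape = len(flow_vr) == len(flow_vd) and all(
        len(row_r) == len(row_d) for row_r, row_d in zip(flow_vr, flow_vd)
    )
    if not same_shape:
        raise ValueError("reference and assessed flow fields must have the same shape")

    # SR video: magnitude of the difference between reference and assessed flow.
    return [
        [math.hypot(rx - dx, ry - dy) for (rx, ry), (dx, dy) in zip(row_r, row_d)]
        for row_r, row_d in zip(flow_vr, flow_vd)
    ]
```

Passing only the flow of the assessed video selects the UGC branch, while passing a reference flow as the second argument selects the SR branch. In practice the flow fields would come from an optical flow estimator and be stored in an array library rather than nested lists. Plain lists are used here only to keep the sketch dependency-free.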