Abstract: Many popular video quality assessment (VQA) methods build models by simulating human visual perception and adopt a simple regression strategy to predict video quality scores. However, these methods either pay too little attention to the regression stage, which is prone to misprediction, or fail to accurately understand video content containing changing or sudden motion. To remedy these shortcomings, we propose a full-reference (FR) video quality assessment model that integrates multi-task learning regression with spatio-temporal feature analysis to predict video quality. First, the model partitions each frame of the reference and distorted videos into patches and calculates their entropy values to guide patch selection. A 2D Siamese network is then applied to the selected patches to learn spatial information. To capture temporal distortions more effectively, a multi-frame difference map is computed for each distorted video. These difference maps are also partitioned into patches, and the half with the highest entropy values is selected as temporal features. Additionally, we incorporate the temporal masking effect to optimize the spatial error and temporal features, and adopt a 3D convolutional neural network (CNN) for spatio-temporal feature fusion. Following recent evidence on quality classification and quality regression, a constrained multi-task learning regression model is designed to aggregate the quality score, using a quality classification subtask to constrain and optimize the main quality regression task. Finally, the video quality score is predicted through the regression branch. We evaluated our algorithm on five public VQA databases. The experimental results show that the proposed algorithm achieves superior performance compared with existing VQA methods.
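The entropy-guided patch selection described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration rather than the authors' implementation: the function names, the non-overlapping square patches, the 256-bin grayscale histogram entropy, and the definition of the multi-frame difference map as the mean absolute difference between consecutive frames are all hypothetical choices, since the abstract does not specify them.

```python
import numpy as np

def patch_entropy(patch, bins=256):
    """Shannon entropy (in bits) of a grayscale patch's intensity histogram.
    Assumes pixel values lie in [0, 256)."""
    hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

def split_into_patches(frame, size):
    """Split a 2D frame into non-overlapping size x size patches (row-major).
    Border pixels that do not fill a whole patch are dropped (an assumption)."""
    h, w = frame.shape
    return [frame[y:y + size, x:x + size]
            for y in range(0, h - size + 1, size)
            for x in range(0, w - size + 1, size)]

def select_top_half(patches):
    """Return the indices of the half of the patches with the highest entropy."""
    ents = np.array([patch_entropy(p) for p in patches])
    keep = len(patches) // 2
    order = np.argsort(-ents, kind="stable")[:keep]
    return sorted(order.tolist())

def multi_frame_difference(frames):
    """Hypothetical multi-frame difference map: the mean absolute difference
    between consecutive frames. The paper's exact definition may differ."""
    frames = np.asarray(frames, dtype=np.float64)
    return np.abs(np.diff(frames, axis=0)).mean(axis=0)
```

In this sketch the same `split_into_patches` and `select_top_half` helpers would be applied both to individual frames (spatial branch) and to the output of `multi_frame_difference` (temporal branch); the selected patches would then be passed to the networks described in the abstract.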

RankDVQA-mini: Knowledge Distillation-Driven Deep Video Quality Assessment

RankDVQA: Deep VQA based on Ranking-inspired Hybrid Training

KonVid-150k: A Dataset for No-Reference Video Quality Assessment of Videos in-the-Wild

Deep Video Quality Assessment Using Constrained Multi-Task Regression and Spatio-temporal Feature Fusion

Ada-DQA: Adaptive Diverse Quality-aware Feature Acquisition for Video Quality Assessment

Video quality assessment with dense features and ranking pooling

Deep Learning Based Full-reference and No-reference Quality Assessment Models for Compressed UGC Videos

Deep Quality Assessment of Compressed Videos: A Subjective and Objective Study

MV-VVQA: Multi-View Learning for No-Reference Volumetric Video Quality Assessment

Highly Efficient No-reference 4K Video Quality Assessment with Full-Pixel Covering Sampling and Training Strategy

Deep Local and Global Spatiotemporal Feature Aggregation for Blind Video Quality Assessment

Predicting the Quality of Compressed Videos With Pre-Existing Distortions

Video Quality Assessment: A Comprehensive Survey

Convolutional Neural Networks for Video Quality Assessment

Analysis of Video Quality Datasets via Design of Minimalistic Video Quality Models

Deep Tiny Network for Recognition-Oriented Face Image Quality Assessment

ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment

Capturing Co-existing Distortions in User-Generated Content for No-reference Video Quality Assessment

UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content

Unified Quality Assessment of in-the-Wild Videos with Mixed Datasets Training

No-Reference Nonuniform Distorted Video Quality Assessment Based on Deep Multiple Instance Learning