Abstract:<p>With the rapid development of communication networks, effective video quality assessment (VQA) models that can guide video transmission and compression technologies are in high demand. This paper proposes a general-purpose full-reference VQA method that combines DenseNet with spatial pyramid pooling and RankNet, so that it not only extracts high-level distortion representations and global spatial information from samples but also characterizes the temporal correlation among frames. First, a pretrained DenseNet is modified and fine-tuned to extract high-level features of distorted videos. Then, spatial pyramid pooling is added to the DenseNet module so that it can process inputs of arbitrary spatial resolution. Inputs with the same spatial resolution as the original distorted video can therefore be processed by the trained DenseNet to produce frame-level quality scores that take the global spatial information of the video into account directly. Finally, learning to rank is introduced to explore the high-level temporal correlation of distorted videos, with RankNet serving as the temporal pooling function. Experimental results on two public VQA databases show that the predictions of the proposed algorithm are consistent with human visual perception.</p>
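
As a rough illustration of why spatial pyramid pooling lets a network accept frames of arbitrary resolution, the sketch below pools a single 2D feature map into a fixed-length vector. It is not the authors' implementation: the function name `spatial_pyramid_pool`, the use of max pooling, and the pyramid levels (1, 2, 4) are assumptions chosen for the example, and a real model would apply this per channel on DenseNet feature maps.

```python
def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D feature map (a list of rows) into a fixed-length vector.

    For each pyramid level n, the map is divided into an n x n grid of bins
    and the maximum of each bin is kept. The output length is sum(n * n)
    over all levels, independent of the input height and width.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceiling for the end, so the bins
            # cover the whole map and no bin is ever empty.
            top = (i * height) // n
            bottom = ((i + 1) * height + n - 1) // n
            for j in range(n):
                left = (j * width) // n
                right = ((j + 1) * width + n - 1) // n
                pooled.append(
                    max(
                        feature_map[r][c]
                        for r in range(top, bottom)
                        for c in range(left, right)
                    )
                )
    return pooled


# Two hypothetical feature maps with different spatial sizes.
small = [[float(r * 3 + c) for c in range(3)] for r in range(3)]  # 3 x 3
large = [[float(r * 7 + c) for c in range(7)] for r in range(5)]  # 5 x 7

vec_small = spatial_pyramid_pool(small)
vec_large = spatial_pyramid_pool(large)
```

Both `vec_small` and `vec_large` have 1 + 4 + 16 = 21 entries, so a fully connected layer of fixed input size can follow the pooling stage regardless of the frame resolution.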

Video Quality Assessment by Compact Representation of Energy in 3D-DCT Domain

Human Visual Perception Based Image Quality Assessment for Video Prediction

Blind Video Quality Prediction by Uncovering Human Video Perceptual Representation

An Efficient Quality Assessment Metric for 3D Video

Video Quality Assessment Based on Measuring Perceptual Noise from Spatial and Temporal Perspectives

Novel Spatio-Temporal Structural Information Based Video Quality Metric

A Video Quality Assessment Metric Based on Human Visual System

Video Quality Assessment Metric Based On Spatio-Temporal Motion Information

Video Quality Assessment with Texture Information Fusion for Streaming Applications

Video Quality Assessment Based on LOG Filtering of Videos and Spatiotemporal Slice Images

Video Quality Assessment Based on Correlation Between Spatiotemporal Motion Energies

Capturing Co-existing Distortions in User-Generated Content for No-reference Video Quality Assessment

SDTV Quality Assessment Using Energy Distribution of DCT Coefficients

C3DVQA: Full-Reference Video Quality Assessment with 3D Convolutional Neural Network

A Gabor Feature-Based Full Reference Video Quality Assessment Model Based on Spatiotemporal Slice of Videos

A Method of Video Quality Assessment Based on the Sensitive Region

XGC-VQA: A unified video quality assessment model for User, Professionally, and Occupationally-Generated Content

Visual Saliency and Distortion Weighting Based Video Quality Assessment

Video Quality Assessment: A Comprehensive Survey

Video quality assessment with dense features and ranking pooling

Free-Energy Principle Inspired Video Quality Metric and Its Use in Video Coding