No-reference video quality assessment for user generated content based on deep network and visual perception

Yaya Tan, Guangqian Kong, Xun Duan, Yun Wu, Huiyun Long
DOI: https://doi.org/10.1117/1.JEI.30.5.053026
IF: 0.829
2021-01-01
Journal of Electronic Imaging
Abstract: Video quality assessment (VQA) is an important technique in video service systems. In recent years, the development of deep learning has provided further possibilities for VQA. A no-reference VQA (NR-VQA) method that combines the attention mechanism and human visual perception is proposed for in-the-wild videos. First, a deep network consisting of a convolutional neural network and an attention mechanism is constructed to extract deep perceptual features from individual frames, and global covariance pooling is applied to the downsampled features to capture their second-order information. Second, a Transformer network is used for temporal modeling to learn the long-term dependencies of the perceptual quality prediction. Finally, a temporal weighting strategy based on visual perception combines the frame-level scores by weighted summation to obtain the final video quality score. Experiments on three public databases of authentically distorted user-generated videos, namely KoNViD-1k, CVD2014, and LIVE-VQC, show that the proposed method assesses quality effectively under authentic distortion and outperforms several recent NR-VQA methods. © 2021 SPIE and IS&T
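Two of the steps named in the abstract, second-order pooling of frame features and weighted summation of frame-level scores, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `global_covariance_pooling` and `weighted_video_score` are hypothetical, the covariance is a plain sample covariance without any further normalization, and the weights are supplied by the caller because the abstract does not specify the paper's temporal weighting rule.

```python
import numpy as np


def global_covariance_pooling(features):
    """Pool a (C, H, W) feature map into a (C, C) channel covariance matrix.

    Assumption: plain covariance over spatial positions; the paper may use
    an additional normalization that the abstract does not describe.
    """
    channels = features.shape[0]
    x = features.reshape(channels, -1).astype(float)  # (C, H*W)
    x = x - x.mean(axis=1, keepdims=True)             # center each channel
    return x @ x.T / x.shape[1]                       # second-order statistics


def weighted_video_score(frame_scores, weights):
    """Combine frame-level scores into one video score by weighted summation.

    Assumption: weights are non-negative, caller-supplied placeholders for
    the paper's perception-based temporal weights; they are normalized to 1.
    """
    scores = np.asarray(frame_scores, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return float(np.sum(w * scores))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fmap = rng.normal(size=(4, 7, 7))          # toy feature map for one frame
    print(global_covariance_pooling(fmap).shape)
    print(weighted_video_score([3.0, 4.0, 2.5], [1.0, 1.0, 2.0]))
```

In a full pipeline, the pooled matrix would be flattened per frame and passed to the temporal model; here the two functions are kept independent so each step can be inspected in isolation.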