No-Reference Video Quality Assessment Based on Ensemble of Knowledge and Data-Driven Models.

Li Su,Pamela C. Cosman,Qihang Peng
DOI: https://doi.org/10.1007/978-3-030-05716-9_19
2019-01-01
Abstract:No-reference (NR) video quality assessment (VQA) aims to evaluate video distortion in line with human visual perception without referring to the corresponding pristine signal. Many methods try to design models using prior knowledge of people’s experience. It is challenging due to the underlying complexity of video content, and the relatively limited understanding of the intricate mechanisms of the human visual system. Recently, some learning-based NR-VQA methods were proposed and regarded as data driven methods. However, in many practical scenarios, the labeled data is quite limited which significantly restricts the learning ability. In this paper, we first propose a data-driven model, V-CNN. It adaptively fits spatial and temporal distortion of time-varying video content. By using a shallow neural network, the spatial part runs faster than traditional models. The temporal part is more consistent with human subjective perception by introducing temporal SSIM jitter and hysteresis pooling. We then exploit the complementarity of V-CNN and a knowledge-driven model, VIIDEO. Compared to state-of-the-art full reference, reduced reference and no reference VQA methods, the proposed ensemble model shows a better balance between performance and efficiency with limited training data.
What problem does this paper attempt to address?