A Blind Video Quality Assessment Method Via Spatiotemporal Pyramid Attention
Wenhao Shen, Mingliang Zhou, Xuekai Wei, Heqiang Wang, Bin Fang, Cheng Ji, Xu Zhuang, Jason Wang, Jun Luo, Huayan Pu, Xiaoxu Huang, Shilong Wang, Huajun Cao, Yong Feng, Tao Xiang, Zhaowei Shang
DOI: https://doi.org/10.1109/tbc.2023.3340031
IF: 4.5
2024-01-01
IEEE Transactions on Broadcasting
Abstract: As social media communication develops, reliable multimedia quality evaluation indicators have become a prerequisite for enriching user experience services. In this paper, we propose a multiscale spatiotemporal pyramid attention (SPA) block for constructing a blind video quality assessment (VQA) method that evaluates the perceptual quality of videos. First, we extract motion information from the video frames at different temporal scales to form a feature pyramid, which provides a feature representation with multiple visual perceptions. Second, we propose an SPA module that effectively extracts multiscale spatiotemporal information at various temporal scales and develops a cross-scale dependency relationship. Finally, the features extracted by a network of multiple stacked spatiotemporal pyramid blocks are passed through a regression network to estimate the perceived quality. The experimental results demonstrate that our method is on par with state-of-the-art approaches. The source code is available online at https://github.com/Land5cape/SPBVQA.
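The first step in the abstract, building a pyramid of motion information at several temporal scales, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that motion is approximated by absolute frame differences and that a "temporal scale" corresponds to the frame gap used for the difference. The function name `temporal_motion_pyramid` and the `strides` parameter are hypothetical.

```python
import numpy as np


def temporal_motion_pyramid(frames, strides=(1, 2, 4)):
    """Build one stack of motion maps per temporal stride.

    frames: array of shape (T, H, W), a grayscale video.
    Motion is approximated by absolute frame differences, which is an
    assumption made for illustration and not taken from the paper.
    """
    frames = np.asarray(frames, dtype=np.float32)
    pyramid = []
    for s in strides:
        if s >= frames.shape[0]:
            raise ValueError("stride must be smaller than the number of frames")
        # Difference between frames that are s steps apart.
        pyramid.append(np.abs(frames[s:] - frames[:-s]))
    return pyramid


# Toy input: 16 random 8x8 frames standing in for a real video.
video = np.random.default_rng(0).random((16, 8, 8))
levels = temporal_motion_pyramid(video)
shapes = [level.shape for level in levels]
```

Each pyramid level has `T - s` motion maps, so larger strides give shorter sequences that cover longer-range motion. In the paper, the pyramid feeds the SPA blocks and a regression network; those stages are not sketched here.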