METER: Multi-task efficient transformer for no-reference image quality assessment
Pengli Zhu,Siyuan Liu,Yancheng Liu,Pew-Thian Yap
DOI: https://doi.org/10.1007/s10489-023-05104-3
IF: 5.3
2023-11-07
Applied Intelligence
Abstract:No-reference image quality assessment (NR-IQA) is a fundamental yet challenging task in computer vision. Current NR-IQA methods based on convolutional neural networks typically employ deeply-stacked convolutions to learn local features pertinent to image quality, neglecting the importance of non-local information and distortion types. As a remedy, we introduce in this paper an end-to-end multi-task efficient transformer (METER) for the NR-IQA task, consisting of a multi-scale semantic feature extraction (MSFE) backbone module, a distortion type identification (DTI) module, and an adaptive quality prediction (AQP) module. METER identifies the distortion type using the DTI module to facilitate extraction of distortion-specific features via the MSFE module. METER scores image quality in an adaptive manner by adjusting the weights and biases of adaptive fully-connected (AFC) layers in the AQP module, increasing generalizability to images captured in different natural environments. Experimental results demonstrate that METER significantly outperforms existing methods for accuracy and efficiency across five public datasets: LIVEC, BID, KonIQ, LIVE, and CSIQ, and exhibits remarkable performance with Pearson's linear correlation coefficients: 0.923, 0.912, 0.937, 0.978, and 0.982 on respective datasets when compared to human subjective scores. Additionally, METER also attains higher efficiency (-53.9% Params and -87.7% FLOPs) compared to the existing transformer-based methods, making it valuable for real-world applications.
computer science, artificial intelligence