Abstract: Image quality assessment is a fundamental problem in image processing. Because reference images are unavailable in most practical scenarios, no-reference image quality assessment (NR-IQA) has gained increasing attention. With the development of deep learning, many deep neural network-based NR-IQA methods have been proposed, which try to learn image quality from database information. Meanwhile, the Transformer has achieved remarkable progress in various vision tasks. Because its attention mechanism fits the global perceptual impact of artifacts as perceived by humans, the Transformer is well suited to image quality assessment. In this paper, we propose a Transformer-based NR-IQA model that uses a predicted objective error map and a perceptual quality token. Specifically, we first generate the predicted error map by pre-training a model consisting of a Transformer encoder and decoder, using the objective difference between the distorted and reference images as supervision. We then freeze the parameters of the pre-trained model and design a second branch that uses a vision Transformer to extract the perceptual quality token, which is fused with the predicted error map. Finally, the fused features are regressed to the final image quality score. Extensive experiments show that the proposed method outperforms the current state of the art on both authentic and synthetic image databases. Moreover, the attention map extracted by the perceptual quality token conforms to the characteristics of the human visual system.
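To make the pre-training supervision concrete, the sketch below builds a per-pixel error map from a distorted image and its reference. The abstract does not specify how the "objective difference" is defined, so the absolute difference scaled to [0, 1] is an assumption chosen only for illustration. The function name `error_map`, the `max_value` parameter, and the toy pixel values are likewise hypothetical.

```python
def error_map(distorted, reference, max_value=255.0):
    """Return the per-pixel absolute difference, scaled to [0, 1].

    Assumption: the abstract does not give the exact error definition;
    absolute difference is used here purely as an illustration.
    Images are nested lists of grayscale values (rows of pixels).
    """
    if len(distorted) != len(reference):
        raise ValueError("images must have the same height")
    rows = []
    for d_row, r_row in zip(distorted, reference):
        if len(d_row) != len(r_row):
            raise ValueError("images must have the same width")
        rows.append([abs(d - r) / max_value for d, r in zip(d_row, r_row)])
    return rows


# Toy 2x3 grayscale images (hypothetical pixel values in 0..255).
reference = [[10, 20, 30], [40, 50, 60]]
distorted = [[10, 25, 30], [40, 50, 111]]

# Map that an encoder-decoder could be trained to predict from
# the distorted image alone.
target = error_map(distorted, reference)
```

In a setup like the one described, such a map would only be needed during pre-training, where reference images exist. At inference time the frozen model would predict it from the distorted image alone.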
