Abstract: Blind Image Quality Assessment (BIQA) transfers poorly when a distribution shift occurs, e.g., from synthetic degradation to authentic degradation. To mitigate this, some studies have designed unsupervised domain adaptation (UDA) schemes for BIQA, which aim to eliminate the domain shift through adversarial feature alignment. However, because of the global average pooling operation, the feature alignment usually takes place in the low-frequency space of the features. This ignores the transferable perception knowledge in other frequency components and leads to a sub-optimal solution for the UDA of BIQA. To overcome this, we propose an alignment strategy from a frequency perspective, i.e., Frequency Alignment (dubbed FreqAlign), to excavate the perception-oriented transferability of BIQA in the frequency space. Concretely, we study which frequency components of features are better suited to perception-oriented alignment. Based on this, we propose to improve the perception-oriented transferability of BIQA by decomposing features into frequency components and selecting the components that contain the most transferable perception knowledge for alignment. To achieve stable and effective frequency selection, we further propose frequency movement with a sliding window to find the optimal frequencies for alignment, which is composed of three strategies, i.e., warm-up with pre-training, frequency movement-based selection, and perturbation-based fine-tuning. Extensive experiments under different domain adaptation settings of BIQA validate the effectiveness of the proposed method. The code will be released at https://github.com/lixinustc/Openworld-IQA.
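The claim that global average pooling restricts alignment to the low-frequency space can be checked numerically. The sketch below is only an illustration, not the authors' implementation: it assumes a 2D discrete Fourier transform as the frequency decomposition (the abstract does not say which transform FreqAlign uses), and the helper names `gap` and `frequency_components` are hypothetical. It shows that the pooled value of each channel equals the zero-frequency (DC) coefficient of that channel's spectrum, scaled by the number of spatial positions.

```python
import numpy as np

def gap(features):
    """Global average pooling over the spatial axes of a (C, H, W) array."""
    return features.mean(axis=(1, 2))

def frequency_components(features):
    """2D DFT over the spatial axes (an assumed choice of decomposition).

    Entry [:, 0, 0] of the result is the zero-frequency (DC) coefficient,
    i.e. the sum of each channel over all spatial positions.
    """
    return np.fft.fft2(features, axes=(1, 2))

# Random stand-in for a feature map with 4 channels and an 8x8 spatial grid.
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 8, 8))

spectrum = frequency_components(feats)
height, width = feats.shape[1:]

# The DC coefficient divided by H*W is exactly the spatial mean.
dc_mean = spectrum[:, 0, 0].real / (height * width)
pooled = gap(feats)
```

Every other coefficient in `spectrum` is discarded by pooling, which is the information that a frequency-aware alignment could draw on instead.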

Frequency Domain Correspondence For Speaker Normalization

A Bayesian Framework of Non-Synchronous Measurements at Coprime Positions for Sound Source Localization with High Resolution

Reference Point Alignment Frequency Warp Method for Speaker Adaptation

A Permutation Alignment Algorithm Based On Multiple Criterions In Audio Signal Frequency-Domain Blind Deconvolution

Supervisory Data Alignment for Text-Independent Voice Conversion

Weakly-supervised forced alignment of disfluent speech using phoneme-level modeling

FreqAlign: Excavating Perception-oriented Transferability for Blind Image Quality Assessment from A Frequency Perspective

Multiscale Point Correspondence Using Feature Distribution and Frequency Domain Alignment

Deep Normalization for Speaker Vectors

Mutual Alignment between Audiovisual Features for End-to-End Audiovisual Speech Recognition

A perceptually-motivated low-complexity instantaneous linear channel normalization technique applied to speaker verification

Variance Normalised Features for Language and Dialect Discrimination

A Spatial Long-Term Iterative Mask Estimation Approach for Multi-Channel Speaker Diarization and Speech Recognition

Score domain speaking rate normalization for speaker recognition

Multi-Pitch Detection for Co-Channel Speech Utilizing Frequency Channel Piecewise Integration and Morphological Feedback Verification Tracking

Time-frequency Network for Robust Speaker Recognition

Research on Score Domain Speaking Rate Normalization for Speaker Recognition

The Mason-Alberta Phonetic Segmenter: A forced alignment system based on deep neural networks and interpolation

Generalized domain adaptation framework for parametric back-end in speaker recognition

Speaker Recognition Using DMFCC over Telephone Channels