A No-Reference Quality Assessment Method for Digital Human Head

Yingjie Zhou, Zicheng Zhang, Wei Sun, Xiongkuo Min, Xianghe Ma, Guangtao Zhai
2023-10-26
Abstract: In recent years, digital humans have been widely applied in augmented/virtual reality (A/VR), where viewers are allowed to freely observe and interact with the volumetric content. However, the digital humans may be degraded with various distortions during the procedure of generation and transmission. Moreover, little effort has been put into the perceptual quality assessment of digital humans. Therefore, it is urgent to carry out objective quality assessment methods to tackle the challenge of digital human quality assessment (DHQA). In this paper, we develop a novel no-reference (NR) method based on Transformer to deal with DHQA in a multi-task manner. Specifically, the front 2D projections of the digital humans are rendered as inputs and the vision transformer (ViT) is employed for the feature extraction. Then we design a multi-task module to jointly classify the distortion types and predict the perceptual quality levels of digital humans. The experimental results show that the proposed method well correlates with the subjective ratings and outperforms the state-of-the-art quality assessment methods.
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
The paper primarily focuses on the quality assessment of Digital Human Heads (DHH). Specifically, with the development of Augmented Reality (AR) and Virtual Reality (VR) technologies, the application of digital humans in these fields is becoming increasingly widespread. However, various distortions may be introduced during the generation and transmission of digital humans, affecting their visual quality. Currently, there is limited work on the perceptual quality assessment of digital humans, especially digital human heads. Therefore, there is an urgent need to develop objective quality assessment methods to address the challenges of Digital Human Quality Assessment (DHQA).

The paper proposes a Transformer-based No-Reference (NR) multi-task learning method to solve the aforementioned problem. This method is implemented through the following steps:

1. **Projection Module**: First, the frontal 2D projection of the digital human head is taken as input.
2. **Feature Extraction Module**: The Vision Transformer (ViT) is used to extract features from the projection image.
3. **Multi-task Module**: A multi-task module is designed, which includes two sub-tasks: one for classifying the specific distortion type of the digital human head, and the other for predicting its perceptual quality level.

Experimental results show that the proposed method is highly correlated with subjective scores and outperforms current state-of-the-art quality assessment methods in terms of prediction accuracy. Additionally, comparisons with other No-Reference Image Quality Assessment (NR IQA) methods and Full-Reference Point Cloud Quality Assessment (FR PCQA) methods demonstrate the effectiveness and superiority of the proposed method. Finally, the paper conducts ablation experiments to verify the effectiveness of multi-task learning and compares the performance of different backbone networks.
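
To make the multi-task idea more concrete, the following is a minimal sketch of how a shared feature vector might feed two heads, one for distortion classification and one for quality prediction. It is not the authors' implementation: the class `MultiTaskHead`, the helper names, the feature dimension, the number of distortion types, the treatment of quality as a single scalar, and the loss form (cross-entropy plus a weighted squared error) are all assumptions made for illustration. The summary above does not specify any of them. The feature vector stands in for the ViT output, which is not computed here.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output value per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskHead:
    """Two heads on one shared feature vector (illustrative only)."""

    def __init__(self, feat_dim, num_distortions, seed=0):
        rng = random.Random(seed)
        # Head 1: distortion-type classification.
        self.cls_w, self.cls_b = init_layer(num_distortions, feat_dim, rng)
        # Head 2: quality prediction, assumed here to be a single scalar.
        self.reg_w, self.reg_b = init_layer(1, feat_dim, rng)

    def forward(self, feature):
        probs = softmax(linear(feature, self.cls_w, self.cls_b))
        quality = linear(feature, self.reg_w, self.reg_b)[0]
        return probs, quality


def multitask_loss(probs, quality, true_type, true_score, reg_weight=1.0):
    """Assumed joint loss: cross-entropy plus weighted squared error."""
    cross_entropy = -math.log(probs[true_type])
    squared_error = (quality - true_score) ** 2
    return cross_entropy + reg_weight * squared_error


# Usage with a stand-in feature vector (a real one would come from the ViT).
feature = [0.5] * 8
head = MultiTaskHead(feat_dim=8, num_distortions=4)
probs, quality = head.forward(feature)
loss = multitask_loss(probs, quality, true_type=2, true_score=3.5)
```

Sharing one feature vector between both heads means that gradients from the distortion classification task also shape the representation used for quality prediction, which is the usual motivation for a multi-task setup; `reg_weight` is a hypothetical knob for balancing the two terms.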