Abstract: No-Reference Point Cloud Quality Assessment (NR-PCQA) aims to objectively assess the human perceptual quality of point clouds without relying on pristine-quality point clouds for reference. It is becoming increasingly significant with the rapid advancement of immersive media applications such as virtual reality (VR) and augmented reality (AR). However, current NR-PCQA models attempt to indiscriminately learn point cloud content and distortion representations within a single network, overlooking their distinct contributions to quality information. To address this issue, we propose DisPA, a novel disentangled representation learning framework for NR-PCQA. The framework trains a dual-branch disentanglement network to minimize mutual information (MI) between representations of point cloud content and distortion. Specifically, to fully disentangle representations, the two branches adopt different philosophies: the content-aware encoder is pretrained by a masked auto-encoding strategy, which allows the encoder to capture semantic information from rendered images of distorted point clouds; the distortion-aware encoder takes a mini-patch map as input, which forces the encoder to focus on low-level distortion patterns. Furthermore, we utilize an MI estimator to estimate a tight upper bound of the actual MI and further minimize it to achieve explicit representation disentanglement. Extensive experimental results demonstrate that DisPA outperforms state-of-the-art methods on multiple PCQA datasets.
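The mini-patch map mentioned in the abstract can be illustrated with a minimal sketch. It assumes the map is built by dividing a rendered image into a uniform grid and stitching together one small, randomly positioned crop per cell; the function name `mini_patch_map` and the `grid` and `patch` values are hypothetical and are not taken from the paper.

```python
import numpy as np

def mini_patch_map(image, grid=4, patch=8, rng=None):
    """Sample one small patch per grid cell and stitch the patches into a map.

    image: (H, W, C) array, e.g. a rendered view of a distorted point cloud.
    grid:  number of cells per side (hypothetical default).
    patch: side length of each mini-patch in pixels (hypothetical default).
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, c = image.shape
    cell_h, cell_w = h // grid, w // grid
    if patch > cell_h or patch > cell_w:
        raise ValueError("patch must fit inside a grid cell")
    out = np.empty((grid * patch, grid * patch, c), dtype=image.dtype)
    for i in range(grid):
        for j in range(grid):
            # Random top-left corner inside cell (i, j). Pixels are copied at
            # native resolution, so local distortion patterns are not resampled.
            y = i * cell_h + rng.integers(0, cell_h - patch + 1)
            x = j * cell_w + rng.integers(0, cell_w - patch + 1)
            out[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = \
                image[y:y + patch, x:x + patch]
    return out

# Stand-in for a rendered image; real inputs would come from a renderer.
rendered = np.random.default_rng(0).random((224, 224, 3))
patch_map = mini_patch_map(rendered, grid=4, patch=8, rng=np.random.default_rng(1))
print(patch_map.shape)  # (32, 32, 3)
```

Because the stitched crops come from scattered locations, the global layout of the object is largely destroyed while pixel-level artifacts survive, which is one plausible reading of how such an input steers an encoder toward low-level distortion patterns.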

What problem does this paper attempt to address?

This paper addresses representation learning for No-Reference Point Cloud Quality Assessment (NR-PCQA). The goal of NR-PCQA is to objectively evaluate the perceptual quality of a point cloud without access to the original high-quality point cloud as a reference. With the rapid development of immersive media applications such as virtual reality and augmented reality, this problem is becoming increasingly important. However, existing NR-PCQA models have the following limitations when learning point cloud content and distortion representations:

1. **Single network structure**: Most existing methods learn the representations of point cloud content and distortion simultaneously in a single network, ignoring their different contributions to quality information.
2. **Feature entanglement**: Point cloud content and distortion patterns are highly entangled in the representation space, which limits model performance.
3. **Data scarcity**: The high-dimensional nature of point cloud content makes its representation difficult to learn, and existing PCQA datasets cover very limited content, so models are prone to overfitting.

To solve these problems, the authors propose a disentangled representation learning framework called DisPA (Disentangled Perceptual Assessment), which separates the representations of point cloud content and distortion by minimizing the mutual information (MI) between them. The main contributions of DisPA include:

- **Two-branch structure**: Two separate encoders learn the representations of point cloud content and distortion, respectively.
- **Pre-training strategy**: The content-aware encoder is pre-trained with a masked auto-encoding strategy so that it captures semantic information from rendered images of distorted point clouds.
- **Local distortion map generation**: A local distortion map (the mini-patch map of the abstract) is generated by grid sampling and fed to the distortion-aware encoder, forcing it to focus on low-level distortion patterns.
- **MI regularization**: A mutual information estimator estimates an upper bound on the MI between the content and distortion representations, and this bound is minimized to achieve explicit representation disentanglement.

With these components, DisPA outperforms existing methods on multiple PCQA datasets and, according to the authors, better follows the perceptual mechanism of the human visual system.
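The MI regularization step can be illustrated with a small sketch. The text above does not say which estimator DisPA uses, so this example substitutes a CLUB-style (contrastive log-ratio upper bound) estimate as an assumed stand-in, and it fits the variational distribution q(y|x) as a linear-Gaussian model by least squares, where a practical system would more likely train a small neural network. The function name `club_upper_bound` and the synthetic features are hypothetical.

```python
import numpy as np

def club_upper_bound(x, y):
    """CLUB-style estimate of an upper bound on I(x; y) from paired samples.

    x: (N, Dx) content features, y: (N, Dy) distortion features.
    q(y|x) is a linear-Gaussian model fitted by least squares (an assumption
    made for this sketch only).
    """
    n = x.shape[0]
    xb = np.hstack([x, np.ones((n, 1))])          # append a bias column
    w, *_ = np.linalg.lstsq(xb, y, rcond=None)    # fit the mean of q(y|x)
    mu = xb @ w                                   # (N, Dy) predicted means
    var = np.mean((y - mu) ** 2, axis=0) + 1e-6   # per-dimension variance
    # log q(y_i | x_i) for matched pairs; the Gaussian normalizing constant
    # is omitted because it cancels in the difference below.
    positive = -((y - mu) ** 2 / (2 * var)).sum(axis=1)            # (N,)
    # log q(y_j | x_i) for all pairs (i, j), averaged over j.
    diff = y[None, :, :] - mu[:, None, :]                          # (N, N, Dy)
    negative = -(diff ** 2 / (2 * var)).sum(axis=2).mean(axis=1)   # (N,)
    return float(np.mean(positive - negative))

rng = np.random.default_rng(0)
content = rng.normal(size=(512, 4))
# Features that are almost a linear function of the content features.
entangled = content @ rng.normal(size=(4, 4)) + 0.1 * rng.normal(size=(512, 4))
# Features drawn independently of the content features.
independent = rng.normal(size=(512, 4))

print(club_upper_bound(content, entangled) > club_upper_bound(content, independent))  # True
```

In a training loop, an estimate of this kind would typically be added to the quality-regression loss as a weighted penalty, so that lowering it pushes the two branches toward statistically independent representations. The weighting and the training schedule are not described in the text above.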

Learning Disentangled Representations for Perceptual Point Cloud Quality Assessment via Mutual Information Minimization
