Abstract: The universality of the point cloud format enables many 3D applications, making the compression of point clouds a critical phase in practice. Sampled as discrete 3D points, a point cloud approximates 2D surface(s) embedded in 3D with a finite bit-depth. However, the point distribution of a practical point cloud changes drastically as its bit-depth increases, requiring different methodologies for effective consumption/analysis. In this regard, a heterogeneous point cloud compression (PCC) framework is proposed. We unify typical point cloud representations -- point-based, voxel-based, and tree-based representations -- and their associated backbones under a learning-based framework to compress an input point cloud at different bit-depth levels. Having recognized the importance of voxel-domain processing, we augment the framework with a proposed context-aware upsampling for decoding and an enhanced voxel transformer for feature aggregation. Extensive experimentation demonstrates the state-of-the-art performance of our proposal on a wide range of point clouds.

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper addresses a key issue in Point Cloud Compression (PCC): compressing point cloud data efficiently at different bit depths. Point clouds describe the surfaces of 3D objects or scenes and are widely used in AR/VR, robotics, autonomous driving, and other fields. However, as the precision of a point cloud (its bit depth) increases, the point distribution changes, so different processing methods are needed to compress and analyze the data effectively.

Specifically, the paper proposes a new heterogeneous point cloud compression framework, PIVOT-Net (PoInt, VOxel and Tree), which unifies point-based, voxel-based, and tree-based point cloud representations and combines them with deep learning techniques to meet the compression needs of point clouds at different bit depths. In this way, PIVOT-Net can compress point cloud data at various bit depth levels, improving transmission and storage efficiency.

### Main Contributions

1. **Unified Point Cloud Compression Framework**: PIVOT-Net is the first point cloud compression method that unifies point-based, voxel-based, and tree-based representations within a single learning framework, capable of efficiently compressing point clouds at different bit depth levels.
2. **Enhanced Voxel Domain Processing**: PIVOT-Net improves voxel domain processing by introducing a context-aware upsampling module for decoding and an enhanced voxel transformer module for feature aggregation.
3. **State-of-the-Art Compression Performance**: Experimental results show that PIVOT-Net achieves state-of-the-art compression performance on a wide range of practical point cloud datasets.
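The claim that point distribution changes with bit depth can be illustrated with a small experiment. The sketch below is not from the paper: it samples a synthetic sphere surface (an assumed stand-in for a real scan), quantizes it to several bit depths, and reports how many voxels are occupied and what fraction of them still have an occupied face neighbor. The helper names `quantize`, `neighbor_fraction`, and `sample_sphere` are hypothetical.

```python
import math
import random

def quantize(points, bit_depth):
    """Map points in [0, 1)^3 to integer voxel coordinates on a 2^bit_depth grid."""
    scale = 1 << bit_depth
    return {tuple(min(int(c * scale), scale - 1) for c in p) for p in points}

def neighbor_fraction(voxels):
    """Fraction of occupied voxels with at least one occupied face neighbor."""
    if not voxels:
        return 0.0
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    connected = sum(
        any((x + dx, y + dy, z + dz) in voxels for dx, dy, dz in offsets)
        for x, y, z in voxels
    )
    return connected / len(voxels)

def sample_sphere(n, seed=0):
    """Sample n points on a sphere of radius 0.45 centered in the unit cube."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v)) or 1.0
        points.append(tuple(0.5 + 0.45 * c / norm for c in v))
    return points

points = sample_sphere(5000)
for bit_depth in (4, 6, 8, 10, 12):
    voxels = quantize(points, bit_depth)
    print(bit_depth, len(voxels), round(neighbor_fraction(voxels), 3))
```

With a fixed number of samples, low bit depths tend to give a densely connected voxel surface, while high bit depths leave mostly isolated points. That shift is the motivation the summary gives for treating coarse and fine levels with different representations.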
### Method Overview

The main architecture of PIVOT-Net includes the following components:

- **Point Analysis Network**: Generates a coarse representation of the point cloud and extracts geometric features for each point.
- **Voxel Analysis Network**: Uses sparse convolutional neural networks (CNNs) to downsample the point cloud and aggregate features.
- **Feature Analysis Network**: Further downsamples the feature map to reduce its resolution.
- **Tree-based Encoder**: Losslessly encodes the coarse partition information of the point cloud.
- **Decoder**: Consists of a feature synthesis network, a voxel synthesis network, and a point synthesis network, which together reconstruct the point cloud from the compressed data.

Working together, these components let PIVOT-Net handle point cloud data at different bit depth levels for both compression and decompression.
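The summary does not specify how the tree-based encoder is built. As a rough illustration of lossless tree coding of a coarse partition, the sketch below serializes occupied voxels into breadth-first octree occupancy bytes and decodes them back. It is a generic octree scheme under assumed conventions (child bit order, no entropy coding), not the paper's coder, and `encode_octree` and `decode_octree` are hypothetical names.

```python
def encode_octree(voxels, depth):
    """Serialize integer voxels in [0, 2^depth)^3 as breadth-first occupancy bytes."""
    codes = []
    parents = [(0, 0, 0)]  # the root node
    for level in range(1, depth + 1):
        shift = depth - level
        # Occupied nodes at this level, in a canonical (sorted) order.
        children = sorted({(x >> shift, y >> shift, z >> shift) for x, y, z in voxels})
        occupancy = {p: 0 for p in parents}
        for cx, cy, cz in children:
            child_index = ((cx & 1) << 2) | ((cy & 1) << 1) | (cz & 1)
            occupancy[(cx >> 1, cy >> 1, cz >> 1)] |= 1 << child_index
        codes.extend(occupancy[p] for p in parents)
        parents = children
    return bytes(codes)

def decode_octree(stream, depth):
    """Rebuild the voxel set from the occupancy bytes produced by encode_octree."""
    parents = [(0, 0, 0)]
    pos = 0
    for _ in range(depth):
        children = []
        for px, py, pz in parents:
            byte = stream[pos]
            pos += 1
            for child_index in range(8):
                if byte & (1 << child_index):
                    children.append((
                        2 * px + (child_index >> 2),
                        2 * py + ((child_index >> 1) & 1),
                        2 * pz + (child_index & 1),
                    ))
        # Match the encoder's node ordering before reading the next level.
        parents = sorted(children)
    return set(parents)

coarse_voxels = {(0, 0, 0), (3, 5, 7), (4, 4, 4), (7, 7, 7)}
stream = encode_octree(coarse_voxels, depth=3)
assert decode_octree(stream, depth=3) == coarse_voxels
```

A practical codec would pass these occupancy bytes through an entropy coder, typically with a context model. The sketch leaves that out and shows only that the partition round-trips exactly.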

PIVOT-Net: Heterogeneous Point-Voxel-Tree-based Framework for Point Cloud Compression
