Abstract: We introduce PolyDiff, the first diffusion-based approach capable of directly generating realistic and diverse 3D polygonal meshes. In contrast to methods that use alternate 3D shape representations (e.g. implicit representations), our approach is a discrete denoising diffusion probabilistic model that operates natively on the polygonal mesh data structure. This enables learning of both the geometric properties of vertices and the topological characteristics of faces. Specifically, we treat meshes as quantized triangle soups, progressively corrupted with categorical noise in the forward diffusion phase. In the reverse diffusion phase, a transformer-based denoising network is trained to revert the noising process, restoring the original mesh structure. At inference, new meshes can be generated by applying this denoising network iteratively, starting with a completely noisy triangle soup. Consequently, our model is capable of producing high-quality 3D polygonal meshes, ready for integration into downstream 3D workflows. Our extensive experimental analysis shows that PolyDiff achieves a significant advantage (avg. FID and JSD improvement of 18.2 and 5.8 respectively) over current state-of-the-art methods.
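To make the "quantized triangle soup" and "categorical noise" ideas concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the 256-bin resolution, the `[-1, 1]` coordinate range, and the uniform corruption rule (keep a coordinate with probability `keep_prob`, otherwise resample it uniformly over all bins) are assumptions chosen for illustration.

```python
import numpy as np

def quantize_triangle_soup(triangles, num_bins=256):
    """Map float coordinates in [-1, 1] to integer bins in [0, num_bins - 1].

    `triangles` has shape (F, 3, 3): F faces, 3 vertices per face, xyz per vertex.
    The bin count and coordinate range are illustrative assumptions.
    """
    tri = np.asarray(triangles, dtype=np.float64)
    scaled = (tri + 1.0) / 2.0 * (num_bins - 1)
    return np.clip(np.rint(scaled), 0, num_bins - 1).astype(np.int64)

def uniform_forward_noise(x0, keep_prob, num_bins, rng):
    """Sample a noisy soup x_t from a clean soup x_0.

    Each discrete coordinate keeps its value with probability `keep_prob`;
    otherwise it is replaced by a bin drawn uniformly from all `num_bins` bins.
    This is one common form of categorical corruption, assumed here.
    """
    resample = rng.random(x0.shape) >= keep_prob
    random_bins = rng.integers(0, num_bins, size=x0.shape)
    return np.where(resample, random_bins, x0)
```

With `keep_prob=1.0` the soup is returned unchanged, and with `keep_prob=0.0` every coordinate is uniform noise, which corresponds to the "completely noisy triangle soup" that inference starts from.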

What problem does this paper attempt to address?

This paper addresses the problem of **directly generating high-quality 3D polygonal meshes**. The authors propose **PolyDiff**, a method that uses a diffusion model to generate realistic and diverse 3D polygonal meshes without going through an intermediate shape representation.

#### Background and Challenges

1. **Limitations of Existing Methods**:
   - Most current 3D shape generation methods rely on alternative 3D representations (such as voxels, point clouds, or distance fields). These representations suit some learning methods, but converting their output back to polygonal meshes often loses quality, for example sharp edges and flat surfaces.
   - These methods usually convert their output into meshes through a post-processing step (such as marching cubes). The result tends to be over-tessellated or over-smoothed, and it does not match the compact, high-quality meshes that experts model by hand.
2. **Complexity of Non-Euclidean Structures**:
   - Polygonal meshes are non-Euclidean and irregular. Vertices are positioned freely in 3D space, and the size and number of faces vary from mesh to mesh. This irregularity is an additional challenge for deep-learning methods, which are usually designed for regular, grid-based data such as images.

#### PolyDiff's Solutions

1. **Direct Generation of Polygonal Meshes**:
   - PolyDiff is the first diffusion model that directly generates realistic and diverse 3D polygonal meshes. It represents a mesh as a triangle soup with quantized (discrete) coordinates and progressively corrupts these coordinates with categorical noise in the forward diffusion process.
   - In the reverse diffusion process, a transformer-based denoising network is trained to invert the noising process and restore the original mesh structure.
   - During inference, new meshes are generated from a completely noisy triangle soup by applying the denoising network iteratively.
2. **Improved Generation Quality and Diversity**:
   - Experimental results show that PolyDiff outperforms existing state-of-the-art methods in unconditional mesh generation, with average FID and JSD improvements of 18.2 and 5.8 respectively.
3. **Adaptation to Discrete Data Structures**:
   - PolyDiff uses a discrete diffusion model, which matches the discrete nature of polygonal meshes and yields cleaner and more diverse meshes.

### Summary

PolyDiff generates high-quality 3D polygonal meshes directly with a discrete diffusion model. It avoids the quality loss that existing methods incur when converting other representations to meshes, and it improves on prior state-of-the-art methods in both FID and JSD.
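The iterative inference procedure described above (start from pure noise, apply the denoiser repeatedly) can be sketched as follows. This is a simplified predict-then-renoise loop, not the exact reverse posterior of a discrete denoising diffusion model and not the authors' code. `toy_denoiser` is a hypothetical stand-in for the trained transformer, and the linear noise schedule and all function names are assumptions.

```python
import numpy as np

def toy_denoiser(x_t, t, num_bins):
    """Hypothetical placeholder for the trained network.

    Returns logits over bins for every coordinate; here they are all zero,
    i.e. a uniform prediction. A real model would condition on x_t and t.
    """
    return np.zeros(x_t.shape + (num_bins,))

def sample_categorical(probs, rng):
    """Draw one bin index per coordinate by inverting the cumulative distribution."""
    cdf = np.cumsum(probs, axis=-1)
    u = rng.random(probs.shape[:-1] + (1,))
    idx = (u > cdf).sum(axis=-1)
    return np.minimum(idx, probs.shape[-1] - 1)  # guard against float rounding

def sample_mesh(denoiser, num_faces, num_bins, num_steps, rng):
    """Generate a quantized triangle soup of shape (num_faces, 3, 3)."""
    # Start from a completely noisy soup: every coordinate is a uniform random bin.
    x = rng.integers(0, num_bins, size=(num_faces, 3, 3))
    for t in range(num_steps, 0, -1):
        # Predict a distribution over clean coordinates and sample from it.
        logits = denoiser(x, t, num_bins)
        logits = logits - logits.max(axis=-1, keepdims=True)
        probs = np.exp(logits)
        probs /= probs.sum(axis=-1, keepdims=True)
        x0_hat = sample_categorical(probs, rng)
        # Re-noise the prediction to level t - 1 (assumed linear schedule).
        keep_prob = 1.0 - (t - 1) / num_steps
        resample = rng.random(x.shape) >= keep_prob
        x = np.where(resample, rng.integers(0, num_bins, size=x.shape), x0_hat)
    return x
```

At the final step `keep_prob` equals 1, so the returned soup is exactly the denoiser's last prediction. With the uniform placeholder the output is still noise; the sketch only shows the control flow of iterative denoising.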

PolyDiff: Generating 3D Polygonal Meshes with Diffusion Models

PolyDiffuse: Polygonal Shape Reconstruction via Guided Set Diffusion Models

TetraDiffusion: Tetrahedral Diffusion Models for 3D Shape Generation

MeshDiffusion: Score-based Generative 3D Mesh Modeling

Consistent Mesh Diffusion

DMESH: A Structure-Preserving Diffusion Model for 3-D Mesh Denoising

Deformable 3D Shape Diffusion Model

OctFusion: Octree-based Diffusion Models for 3D Shape Generation

Diffusion 3D Features (Diff3F): Decorating Untextured Shapes with Distilled Semantic Features

Neural Point Cloud Diffusion for Disentangled 3D Shape and Appearance Generation

Mixed Diffusion for 3D Indoor Scene Synthesis

DiffuScene: Denoising Diffusion Models for Generative Indoor Scene Synthesis

DMesh++: An Efficient Differentiable Mesh for Complex Shapes

RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation

Diff-pcg: diffusion point cloud generation conditioned on continuous normalizing flow

A Least Squares Based Diamond Scheme for 3D Heterogeneous and Anisotropic Diffusion Problems on Polyhedral Meshes

Generic 3D Diffusion Adapter Using Controlled Multi-View Editing

Geometric-Facilitated Denoising Diffusion Model for 3D Molecule Generation

An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion

HexaGen3D: StableDiffusion is just one step away from Fast and Diverse Text-to-3D Generation

Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation