PolyDiff: Generating 3D Polygonal Meshes with Diffusion Models

Antonio Alliegro, Yawar Siddiqui, Tatiana Tommasi, Matthias Nießner
2023-12-19
Abstract: We introduce PolyDiff, the first diffusion-based approach capable of directly generating realistic and diverse 3D polygonal meshes. In contrast to methods that use alternate 3D shape representations (e.g. implicit representations), our approach is a discrete denoising diffusion probabilistic model that operates natively on the polygonal mesh data structure. This enables learning of both the geometric properties of vertices and the topological characteristics of faces. Specifically, we treat meshes as quantized triangle soups, progressively corrupted with categorical noise in the forward diffusion phase. In the reverse diffusion phase, a transformer-based denoising network is trained to revert the noising process, restoring the original mesh structure. At inference, new meshes can be generated by applying this denoising network iteratively, starting with a completely noisy triangle soup. Consequently, our model is capable of producing high-quality 3D polygonal meshes, ready for integration into downstream 3D workflows. Our extensive experimental analysis shows that PolyDiff achieves a significant advantage (avg. FID and JSD improvement of 18.2 and 5.8 respectively) over current state-of-the-art methods.
Computer Vision and Pattern Recognition
### What problem does this paper attempt to solve?

This paper addresses the problem of **directly generating high-quality 3D polygonal meshes**. Specifically, the authors propose **PolyDiff**, a method that uses a diffusion model to generate realistic and diverse 3D polygonal meshes without going through an intermediate shape representation.

#### Background and Challenges

1. **Limitations of Existing Methods**:
   - Most current 3D shape generation methods rely on alternative 3D representations (such as voxels, point clouds, or distance fields). These representations suit many learning methods, but converting them back to polygonal meshes often loses quality, for example sharp edges and flat surfaces.
   - These methods usually need a post-processing step (such as marching cubes) to turn their output into a mesh. The result can be over-tessellated or over-smoothed, and it does not match the compact, high-quality meshes that experts model by hand.
2. **Complexity of Non-Euclidean Structures**:
   - Polygonal meshes are non-Euclidean and irregular. Vertices are freely positioned in 3D space, and the size and number of faces vary from mesh to mesh. This irregularity is an additional challenge for deep-learning methods, which are usually designed for regular, grid-based data structures (such as images).

#### PolyDiff's Solutions

1. **Direct Generation of Polygonal Meshes**:
   - PolyDiff is the first diffusion model that can directly generate realistic and diverse 3D polygonal meshes. It represents a mesh as a triangle soup with quantized (discrete) coordinates and progressively corrupts these coordinates with categorical noise in the forward diffusion process.
   - In the reverse diffusion process, a transformer-based denoising network is trained to revert the noising process and restore the original mesh structure. During inference, new meshes are generated by applying the denoising network iteratively, starting from a completely noisy triangle soup.
2. **Improved Generation Quality and Diversity**:
   - Experimental results show that PolyDiff outperforms existing state-of-the-art methods on unconditional mesh generation, with average FID and JSD improvements of 18.2 and 5.8 respectively.
3. **Adaptation to Discrete Data Structures**:
   - PolyDiff uses a discrete diffusion model, which is a natural fit for the discrete nature of polygonal meshes and helps it generate cleaner and more diverse meshes.

### Summary

PolyDiff proposes a new method that generates high-quality 3D polygonal meshes directly with a discrete diffusion model. It addresses the difficulties that existing methods face in producing high-quality meshes and reports clear gains on the FID and JSD evaluation metrics.
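To make the forward process described above more concrete, here is a minimal sketch of quantizing a mesh into a triangle soup and corrupting it with categorical noise. It is an illustration only, not the authors' implementation. The bin count (`NUM_BINS = 256`), the linear noise schedule, the uniform transition kernel, and all function names are assumptions chosen for the example; the text above does not specify them.

```python
import random

NUM_BINS = 256  # assumed quantization resolution; not stated in the text above


def quantize(x, lo=-1.0, hi=1.0, bins=NUM_BINS):
    """Map a float coordinate in [lo, hi] to an integer bin in [0, bins - 1]."""
    x = min(max(x, lo), hi)
    return min(int((x - lo) / (hi - lo) * bins), bins - 1)


def to_triangle_soup(vertices, faces):
    """Flatten an indexed triangle mesh into a list of faces.

    Each face becomes 9 quantized coordinates (3 vertices x 3 axes),
    so vertices shared between faces are duplicated.
    """
    return [[quantize(c) for v in face for c in vertices[v]] for face in faces]


def keep_probability(t, num_steps, beta_start=1e-4, beta_end=0.02):
    """Probability that a coordinate has not been resampled after t steps.

    Assumes a linear schedule beta_1..beta_T (hypothetical values); the
    result is the product of (1 - beta_s) for s = 1..t.
    """
    keep = 1.0
    for s in range(1, t + 1):
        beta = beta_start + (beta_end - beta_start) * (s - 1) / max(num_steps - 1, 1)
        keep *= 1.0 - beta
    return keep


def corrupt(soup, t, num_steps, rng, bins=NUM_BINS):
    """Sample a noisy soup x_t directly from the clean soup x_0.

    Under a uniform categorical kernel (an assumption), each coordinate is
    kept with probability keep_probability(t), otherwise replaced by a bin
    drawn uniformly at random.
    """
    keep = keep_probability(t, num_steps)
    return [
        [c if rng.random() < keep else rng.randrange(bins) for c in face]
        for face in soup
    ]


if __name__ == "__main__":
    # One triangle in the z = -1 plane of the unit cube [-1, 1]^3.
    vertices = [(-1.0, -1.0, -1.0), (1.0, -1.0, -1.0), (-1.0, 1.0, -1.0)]
    faces = [(0, 1, 2)]
    soup = to_triangle_soup(vertices, faces)
    rng = random.Random(0)
    print("clean :", soup)
    print("t=500 :", corrupt(soup, 500, 1000, rng))
    print("t=1000:", corrupt(soup, 1000, 1000, rng))
```

At `t = 0` the soup is returned unchanged, and by the final step almost every coordinate has been replaced by a uniformly random bin, which corresponds to the "completely noisy triangle soup" that inference starts from. A denoising network would be trained to predict the clean coordinates from such corrupted inputs; that part is omitted here.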