MeshArt: Generating Articulated Meshes with Structure-guided Transformers

Daoyi Gao, Yawar Siddiqui, Lei Li, Angela Dai
2024-12-16
Abstract: Articulated 3D object generation is fundamental for creating realistic, functional, and interactable virtual assets which are not simply static. We introduce MeshArt, a hierarchical transformer-based approach to generate articulated 3D meshes with clean, compact geometry, reminiscent of human-crafted 3D models. We approach articulated mesh generation in a part-by-part fashion across two stages. First, we generate a high-level articulation-aware object structure; then, based on this structural information, we synthesize each part's mesh faces. Key to our approach is modeling both articulation structures and part meshes as sequences of quantized triangle embeddings, leading to a unified hierarchical framework with transformers for autoregressive generation. Object part structures are first generated as their bounding primitives and articulation modes; a second transformer, guided by these articulation structures, then generates each part's mesh triangles. To ensure coherency among generated parts, we introduce structure-guided conditioning that also incorporates local part mesh connectivity. MeshArt shows significant improvements over state of the art, with 57.1% improvement in structure coverage and a 209-point improvement in mesh generation FID.
Computer Vision and Pattern Recognition, Graphics
What problem does this paper attempt to address?
This paper addresses the generation of three-dimensional (3D) objects with movable, functional parts. Specifically, it proposes MeshArt, a method for generating articulated 3D mesh models. These models have clean, compact geometry and can also reproduce the functional movements of real-world objects (such as the opening and closing of cabinet doors or the rotation of chairs). This problem has not been fully explored by existing 3D generative models. The difficulty lies in modeling the possible movements of functional parts while also generating clean, compact part geometry that respects the articulation structure.

#### Main challenges:

1. **Modeling the movement of functional parts**: The generated 3D models must be able to reproduce the functional movements of real-world objects.
2. **Compactness and cleanliness of geometries**: The generated 3D models should preserve high-quality geometric detail while keeping a compact representation, similar to manually designed 3D models.
3. **Limitations of datasets**: Few 3D datasets currently carry both part and articulation annotations, which makes data-driven learning difficult.

#### Solutions:

To address these challenges, the paper proposes the following solutions:

- **Hierarchical generation framework**: MeshArt adopts a hierarchical Transformer model. It first generates a high-level, articulation-aware object structure, and then generates the mesh triangles of each part based on this structure.
- **Unified triangle sequence prediction**: Both the articulation structure and the part meshes are modeled as sequences of quantized triangle embeddings, so a single hierarchical framework can generate them autoregressively.
- **Structure-guided conditioning mechanism**: Part mesh generation is conditioned on the generated structure and on local part mesh connectivity, which keeps the generated parts coherent with one another.
- **Enhanced dataset**: The PartNet dataset is extended with articulation annotations, increasing the number of movable parts in the dataset by more than six times.

Through these methods, MeshArt improves on the state of the art in both respects: structure coverage improves by 57.1%, and mesh generation FID improves by 209 points.

### Summary

The main contribution of MeshArt is a novel hierarchical method for generating articulated 3D objects. Through the structure-guided conditioning mechanism and the enhanced dataset, it achieves high-quality, compact, and functional 3D mesh generation.
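The idea of treating both a part's bounding primitive and its mesh as a triangle sequence can be illustrated with a small sketch. This is not the paper's implementation: MeshArt quantizes learned triangle embeddings, whereas the code below swaps in plain coordinate binning as a simpler stand-in. The function names (`box_triangles`, `quantize`, `triangles_to_tokens`), the 128-bin resolution, and the normalization of coordinates to [-0.5, 0.5] are all assumptions made for illustration.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def box_triangles(lo: Vec3, hi: Vec3) -> List[Triangle]:
    """Triangulate an axis-aligned bounding box into 12 triangles (2 per face)."""
    (x0, y0, z0), (x1, y1, z1) = lo, hi
    # Eight corners, indexed by bits: bit 0 selects x, bit 1 selects y, bit 2 selects z.
    corners = [
        (x1 if i & 1 else x0, y1 if i & 2 else y0, z1 if i & 4 else z0)
        for i in range(8)
    ]
    # Each face listed as a quad of corner indices, walked around its perimeter.
    quads = [
        (0, 1, 3, 2), (4, 6, 7, 5),  # z = z0, z = z1
        (0, 4, 5, 1), (2, 3, 7, 6),  # y = y0, y = y1
        (0, 2, 6, 4), (1, 5, 7, 3),  # x = x0, x = x1
    ]
    tris: List[Triangle] = []
    for a, b, c, d in quads:
        # Split each quad along the a-c diagonal.
        tris.append((corners[a], corners[b], corners[c]))
        tris.append((corners[a], corners[c], corners[d]))
    return tris


def quantize(v: float, bins: int = 128) -> int:
    """Map a coordinate in [-0.5, 0.5] to an integer bin in [0, bins - 1].

    Assumption: simple uniform binning, standing in for learned quantization.
    """
    v = min(max(v, -0.5), 0.5)
    return min(int((v + 0.5) * bins), bins - 1)


def triangles_to_tokens(tris: List[Triangle], bins: int = 128) -> List[int]:
    """Flatten triangles into one token sequence: 9 tokens per triangle."""
    return [quantize(x, bins) for tri in tris for vertex in tri for x in vertex]


if __name__ == "__main__":
    # A bounding box for a hypothetical part, e.g. a cabinet door.
    door_box = box_triangles((-0.5, -0.4, 0.45), (0.0, 0.4, 0.5))
    tokens = triangles_to_tokens(door_box)
    print(len(door_box), "triangles ->", len(tokens), "tokens")
```

Because a bounding box and a detailed part mesh both reduce to the same kind of sequence here, one sequence model family could in principle consume either, which is the intuition behind the unified framework described above.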