RoCoTex: A Robust Method for Consistent Texture Synthesis with Diffusion Models

Jangyeong Kim, Donggoo Kang, Junyoung Choi, Jeonga Wi, Junho Gwon, Jiun Bae, Dumim Yoon, Junghyun Han
2024-09-30
Abstract: Text-to-texture generation has recently attracted increasing attention, but existing methods often suffer from the problems of view inconsistencies, apparent seams, and misalignment between textures and the underlying mesh. In this paper, we propose a robust text-to-texture method for generating consistent and seamless textures that are well aligned with the mesh. Our method leverages state-of-the-art 2D diffusion models, including SDXL and multiple ControlNets, to capture structural features and intricate details in the generated textures. The method also employs a symmetrical view synthesis strategy combined with regional prompts for enhancing view consistency. Additionally, it introduces novel texture blending and soft-inpainting techniques, which significantly reduce the seam regions. Extensive experiments demonstrate that our method outperforms existing state-of-the-art methods.
Computer Vision and Pattern Recognition, Graphics
What problem does this paper attempt to address?
The paper targets three weaknesses of existing text-to-texture methods: view inconsistency, misalignment between the texture and the underlying mesh, and visible seams. Specifically:

1. **View inconsistency**: Methods based on 2D diffusion models usually generate one view at a time, so the texture differs between viewing angles (Figure 2(a)). A well-known symptom is the "Janus problem", in which front-view features such as a face reappear on the back of the object.
2. **Texture-mesh misalignment**: Because 2D diffusion models have no 3D awareness, the generated textures often fail to align with the geometry of the 3D mesh (Figure 2(b)).
3. **Visible seams and artifacts**: In each iteration, the generated image is projected back onto the 3D mesh and merged into the evolving texture through UV mapping. This step often introduces unwanted artifacts and seams (Figure 2(c)).

To address these problems, the paper proposes RoCoTex, a method for generating high-quality textures that are view-consistent, well aligned with the mesh, and seamless. It combines four techniques:

- **Symmetrical view synthesis strategy**: Generate two symmetrical views simultaneously to improve view consistency, and combine them with regional prompts to mitigate the Janus problem.
- **SDXL and multiple ControlNets**: Use Stable Diffusion XL (SDXL) with multiple ControlNets (depth, normal, and Canny edge) to capture structural features and fine details and to improve the alignment between texture and geometry.
- **Confidence-based texture blending**: Assign a confidence to each pixel and blend each local texture into the global texture according to that confidence, which reduces seams.
- **Soft-inpainting based on Differential Diffusion**: Soften the inpainting mask with a Gaussian blur and combine it with Differential Diffusion, so that untextured regions are inpainted more naturally and the seams between textured and untextured regions are reduced.

Together, these techniques let RoCoTex produce consistent, seamless textures that are well aligned with the mesh. According to the paper's experiments, RoCoTex outperforms existing methods on multiple evaluation metrics.
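The confidence-based fusion and the blurred inpainting mask described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the summary does not say how confidence is computed or how the textures are combined, so the confidence maps are taken as given inputs and the fusion is assumed to be a per-texel weighted average. The function names `blend_by_confidence` and `soften_mask` and the parameters `sigma` and `eps` are hypothetical. The diffusion step that would consume the soft mask is omitted.

```python
import numpy as np


def blend_by_confidence(global_tex, global_conf, local_tex, local_conf, eps=1e-8):
    """Fuse a newly projected local texture into the global texture.

    Assumption (not from the paper): fusion is a per-texel weighted average
    with weights proportional to confidence. Textures are (H, W, 3) float
    arrays; confidences are (H, W) float arrays >= 0, where 0 means "no data".
    """
    total = global_conf + local_conf
    # Weight of the local texture; 0 where neither texture has any data,
    # so the global texture is left unchanged there.
    w_local = np.where(total > eps, local_conf / np.maximum(total, eps), 0.0)
    blended = (1.0 - w_local)[..., None] * global_tex + w_local[..., None] * local_tex
    # Track the best confidence seen so far for each texel.
    new_conf = np.maximum(global_conf, local_conf)
    return blended, new_conf


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def soften_mask(mask, sigma=2.0):
    """Turn a binary inpainting mask (H, W) into a soft mask in [0, 1].

    Applies a separable Gaussian blur: rows first, then columns. The border
    is padded by edge replication so the mask does not fade at the image edge.
    """
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(mask.astype(np.float64), radius, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    soft = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)
    return np.clip(soft, 0.0, 1.0)
```

In this sketch, a texel seen only in the new view takes the local color, a texel seen with equal confidence in both gets the average, and a texel seen in neither keeps its previous value. The soft mask changes gradually across the boundary between textured and untextured regions, which is the property that a mask-strength-aware inpainting method can exploit.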