Structured Pattern Expansion with Diffusion Models

Marzia Riso, Giuseppe Vecchio, Fabio Pellacini
2024-11-13
Abstract: Recent advances in diffusion models have significantly improved the synthesis of materials, textures, and 3D shapes. By conditioning these models via text or images, users can guide the generation, reducing the time required to create digital assets. In this paper, we address the synthesis of structured, stationary patterns, where diffusion models are generally less reliable and, more importantly, less controllable. Our approach leverages the generative capabilities of diffusion models specifically adapted for the pattern domain. It enables users to exercise direct control over the synthesis by expanding a partially hand-drawn pattern into a larger design while preserving the structure and details of the input. To enhance pattern quality, we fine-tune an image-pretrained diffusion model on structured patterns using Low-Rank Adaptation (LoRA), apply a noise rolling technique to ensure tileability, and utilize a patch-based approach to facilitate the generation of large-scale assets. We demonstrate the effectiveness of our method through a comprehensive set of experiments, showing that it outperforms existing models in generating diverse, consistent patterns that respond directly to user input.
Computer Vision and Pattern Recognition, Graphics
What problem does this paper attempt to address?
This paper addresses the following problem: **how to generate high-quality, tileable, and structurally consistent hand-drawn-style structured patterns** with diffusion models. These models are generally less reliable on structured patterns and offer little precise control over the generation process.

### Specific description of the problem:

1. **Limitations of existing methods**:
   - Existing diffusion models perform well on natural images, but often fail to maintain the internal structure, sharpness, and visual consistency of structured patterns.
   - Design applications usually require precise user control over the generated patterns, which existing generation methods do not adequately provide, especially for hand-drawn-style structured patterns.
   - Although text or image conditioning can guide generation, results on structured patterns are unstable, and the details and structure of a pattern are difficult to capture and reproduce accurately.
2. **Research objectives**:
   - Propose a new diffusion-based method specifically for generating and expanding structured, stationary hand-drawn-style patterns.
   - Fine-tune a pre-trained diffusion model with Low-Rank Adaptation (LoRA) so that it adapts better to the task of generating structured patterns.
   - Combine the noise rolling technique with patch-based synthesis to ensure that the generated patterns are of high quality, tileable, and structurally consistent.
   - Give users direct control over the generation process, so that the generated patterns remain faithful to the user input and consistent in structure and appearance.

### Key points of the solution:

- **Fine-tune the pre-trained model**: Improve the model's performance on structured patterns by fine-tuning the pre-trained diffusion model on a dataset of structured patterns.
- **Noise rolling technique**: Periodically shift the latent representation during generation to ensure tileability, so that the generated patterns keep their structural consistency and visual coherence as they are expanded.
- **Patch-based synthesis**: Generate large-scale patterns patch by patch, so that they can be extended seamlessly to canvases of any size.
- **User control**: Let users guide the generation through partial hand-drawn input, so that the generated patterns match their intent.

Through these methods, the paper aims to fill the gap in structured pattern generation left by existing generative models and to provide a more controllable, higher-quality generation tool.
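The noise rolling idea mentioned above can be illustrated with a minimal NumPy sketch. It is one possible reading of "periodically shifting the latent representation", not the authors' implementation: the latent is circularly shifted by a random offset before each denoising step and shifted back afterwards, so the wrap-around seam falls in a different place at every step. The function names, the `(C, H, W)` layout, the choice to undo the shift after every step, and the `denoise_step` placeholder are all assumptions made for illustration.

```python
import numpy as np

def roll_latent(latent, shift):
    """Circularly shift a (C, H, W) latent by (dy, dx) along its spatial axes."""
    dy, dx = shift
    return np.roll(latent, shift=(dy, dx), axis=(1, 2))

def unroll_latent(latent, shift):
    """Undo roll_latent by applying the opposite circular shift."""
    dy, dx = shift
    return np.roll(latent, shift=(-dy, -dx), axis=(1, 2))

def denoise_step(latent, t):
    """Hypothetical stand-in for one diffusion denoising step.

    A real model would predict and remove noise here; this placeholder
    only scales the latent so that the sketch runs on its own.
    """
    return latent * 0.9

def sample_with_noise_rolling(latent, num_steps, rng):
    """Run num_steps denoising steps, rolling the latent before each one."""
    _, height, width = latent.shape
    for t in range(num_steps, 0, -1):
        # Pick a new random offset so the image borders move every step.
        shift = (int(rng.integers(0, height)), int(rng.integers(0, width)))
        latent = roll_latent(latent, shift)
        latent = denoise_step(latent, t)
        # Shift back so the content stays aligned with the user's drawing.
        latent = unroll_latent(latent, shift)
    return latent
```

Because the shift is circular, pixels on opposite borders end up adjacent during some steps and are denoised as neighbours, which is what encourages the result to tile. Shifting back after each step is a design choice of this sketch that keeps the latent aligned with any fixed conditioning, such as the hand-drawn input region.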