Block and Detail: Scaffolding Sketch-to-Image Generation

Vishnu Sarukkai, Lu Yuan, Mia Tang, Maneesh Agrawala, Kayvon Fatahalian
2024-10-26
Abstract: We introduce a novel sketch-to-image tool that aligns with the iterative refinement process of artists. Our tool lets users sketch blocking strokes to coarsely represent the placement and form of objects and detail strokes to refine their shape and silhouettes. We develop a two-pass algorithm for generating high-fidelity images from such sketches at any point in the iterative process. In the first pass we use a ControlNet to generate an image that strictly follows all the strokes (blocking and detail) and in the second pass we add variation by renoising regions surrounding blocking strokes. We also present a dataset generation scheme that, when used to train a ControlNet architecture, allows regions that do not contain strokes to be interpreted as not-yet-specified regions rather than empty space. We show that this partial-sketch-aware ControlNet can generate coherent elements from partial sketches that only contain a small number of strokes. The high-fidelity images produced by our approach serve as scaffolds that can help the user adjust the shape and proportions of objects or add additional elements to the composition. We demonstrate the effectiveness of our approach with a variety of examples and evaluative comparisons. Quantitatively, evaluative user feedback indicates that novice viewers prefer the quality of images from our algorithm over a baseline Scribble ControlNet for 84% of the pairs and found our images had less distortion in 81% of the pairs.
Graphics, Computer Vision and Pattern Recognition
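The second pass described in the abstract (renoising regions surrounding blocking strokes) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the square dilation, the `radius` and `strength` parameters, and the choice to protect pixels near detail strokes are all assumptions made for illustration, and a plain 2D array stands in for the first-pass image or latent.

```python
import numpy as np

def dilate(mask: np.ndarray, radius: int) -> np.ndarray:
    """Grow a boolean mask by `radius` pixels using a square neighborhood."""
    mask = mask.astype(bool)
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask, dtype=bool)
    h, w = mask.shape
    # OR together every shifted copy of the mask inside the square window.
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def renoise_mask(blocking: np.ndarray, detail: np.ndarray, radius: int) -> np.ndarray:
    """Region allowed to vary: near blocking strokes, away from detail strokes.

    Excluding the neighborhood of detail strokes is an assumption of this sketch.
    """
    return dilate(blocking, radius) & ~dilate(detail, radius)

def renoise(first_pass: np.ndarray, mask: np.ndarray, strength: float,
            rng: np.random.Generator) -> np.ndarray:
    """Mix Gaussian noise into the masked region and leave the rest untouched."""
    noise = rng.standard_normal(first_pass.shape)
    noised = np.sqrt(1.0 - strength) * first_pass + np.sqrt(strength) * noise
    return np.where(mask, noised, first_pass)
```

In a real pipeline the renoised region would then be denoised again by the diffusion model, which is what lets the result drift from the blocking strokes while the rest of the image stays fixed; that step is omitted here.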
What problem does this paper attempt to address?
This paper addresses the shortcomings of existing sketch-to-image generation tools in supporting the interactive, partial-sketching workflow that artists commonly use. Specifically:

1. **Insufficient support for proportion and shape exploration**: Existing methods either follow the blocking strokes too strictly, producing unrealistic or poorly proportioned images, or follow them too loosely, so that the generated image deviates significantly from the user's intent. For example, images generated by ControlNet [35] may show unrealistic cat shapes, overly circular flowers, scooters with improper proportions, and simplified cupcake outlines (Figure 2, "ControlNet"). The authors aim for a system that follows the blocking strokes loosely and offers more plausible, realistic alternative forms (Figure 2, "Our method").
2. **No support for partial sketches as input**: Existing methods treat areas without strokes as empty space rather than as not-yet-specified regions. As a result, the generated images rarely suggest how to complete an object in progress or which other elements could be added, which limits how well these tools help users refine the final image step by step.

To solve these problems, the authors propose a new sketch-to-image generation tool that supports artists in creating high-quality images through an iterative refinement process. The tool lets users draw blocking strokes to coarsely represent the placement and form of objects, and detail strokes to refine their shape and silhouettes. The high-fidelity images it generates serve as scaffolds that help users adjust the shape and proportions of objects or add further elements to the composition.
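One way to read the partial-sketch idea above is as a data-pairing choice: if a model is trained on sketches with strokes missing but always paired with the complete image, empty regions stop implying empty space. The source does not spell out the authors' dataset generation scheme, so the sketch below is only a hypothetical illustration; the stroke representation, `keep_fraction`, and both function names are assumptions.

```python
import random
from typing import List, Tuple

Stroke = List[Tuple[int, int]]  # a stroke as an ordered list of (x, y) points

def partial_sketch(strokes: List[Stroke], keep_fraction: float,
                   rng: random.Random) -> List[Stroke]:
    """Keep a random subset of strokes, preserving their original order."""
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in [0, 1]")
    n_keep = round(keep_fraction * len(strokes))
    kept = sorted(rng.sample(range(len(strokes)), n_keep))
    return [strokes[i] for i in kept]

def make_training_pairs(strokes: List[Stroke], image_id: str,
                        fractions: List[float],
                        rng: random.Random) -> List[Tuple[List[Stroke], str]]:
    """Pair every partial sketch with the same complete target image."""
    return [(partial_sketch(strokes, f, rng), image_id) for f in fractions]
```

For example, calling `make_training_pairs(strokes, "img_0", [0.25, 0.5, 1.0], random.Random(0))` on a four-stroke sketch yields sketches with one, two, and four strokes, each paired with the same target `"img_0"`.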