SpecRef: A Fast Training-free Baseline of Specific Reference-Condition Real Image Editing

Songyan Chen,Jiancheng Huang
2024-01-07
Abstract:Text-conditional image editing based on large diffusion generative model has attracted the attention of both the industry and the research community. Most existing methods are non-reference editing, with the user only able to provide a source image and text prompt. However, it restricts user's control over the characteristics of editing outcome. To increase user freedom, we propose a new task called Specific Reference Condition Real Image Editing, which allows user to provide a reference image to further control the outcome, such as replacing an object with a particular one. To accomplish this, we propose a fast baseline method named SpecRef. Specifically, we design a Specific Reference Attention Controller to incorporate features from the reference image, and adopt a mask mechanism to prevent interference between editing and non-editing regions. We evaluate SpecRef on typical editing tasks and show that it can achieve satisfactory performance. The source code is available on
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?