Abstract: Subject-driven image generation aims to generate images containing customized subjects and has recently drawn enormous attention from the research community. However, previous works cannot precisely control the background and position of the target subject. In this work, we aim to fill this gap and propose two novel subject-driven sub-tasks: Subject Replacement and Subject Addition. The new tasks are challenging in multiple aspects: replacing a subject with a customized one can change its shape, texture, and color, while adding a target subject to a designated position in a provided scene requires a context-aware posture. To tackle these two tasks, we first manually curate a new dataset, DreamEditBench, containing 22 different types of subjects and 440 source images with different difficulty levels. We plan to host DreamEditBench as a platform and hire trained evaluators for standard human evaluation. We also devise a method, DreamEditor, that resolves these tasks by performing iterative generation, which enables a smooth adaptation to the customized subject. We conduct automatic and human evaluations to understand the performance of DreamEditor and baselines on DreamEditBench. For Subject Replacement, we find that existing models are sensitive to the shape and color of the original subject: the model failure rate increases dramatically when the source and target subjects are highly different. For Subject Addition, we find that existing models cannot easily blend the customized subjects into the background, leading to noticeable artifacts in the generated image. We hope DreamEditBench can become a standard platform to enable future investigations toward building more controllable subject-driven image editing. Project homepage: https://dreameditbenchteam.github.io/

DreamBooth++: Boosting Subject-Driven Generation Via Region-Level References Packing

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

AttnDreamBooth: Towards Text-Aligned Personalized Text-to-Image Generation

HybridBooth: Hybrid Prompt Inversion for Efficient Subject-Driven Generation

DreamBooth3D: Subject-Driven Text-to-3D Generation

DreamTuner: Single Image is Enough for Subject-Driven Generation

DreamBlend: Advancing Personalized Fine-tuning of Text-to-Image Diffusion Models

HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models

DreamArtist: Towards Controllable One-Shot Text-to-Image Generation via Positive-Negative Prompt-Tuning

Subject-driven Text-to-Image Generation via Apprenticeship Learning

InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning

VideoBooth: Diffusion-based Video Generation with Image Prompts

Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance

Subject-driven Text-to-Image Generation via Preference-based Reinforcement Learning

A New Chinese Landscape Paintings Generation Model based on Stable Diffusion using DreamBooth

Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement

VideoDreamer: Customized Multi-Subject Text-to-Video Generation with Disen-Mix Finetuning

DreamEdit: Subject-driven Image Editing

MultiBooth: Towards Generating All Your Concepts in an Image from Text

SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation

OrientDream: Streamlining Text-to-3D Generation with Explicit Orientation Control