Outline-Guided Object Inpainting with Diffusion Models

Markus Pobitzer, Filip Janicki, Mattia Rigotti, Cristiano Malossi
2024-02-26
Abstract: Instance segmentation datasets play a crucial role in training accurate and robust computer vision models. However, obtaining accurate mask annotations to produce high-quality segmentation datasets is a costly and labor-intensive process. In this work, we show how this issue can be mitigated by starting with small annotated instance segmentation datasets and augmenting them to effectively obtain a sizeable annotated dataset. We achieve that by creating variations of the available annotated object instances in a way that preserves the provided mask annotations, thereby resulting in new image-mask pairs to be added to the set of annotated images. Specifically, we generate new images using a diffusion-based inpainting model to fill out the masked area with a desired object class by guiding the diffusion through the object outline. We show that the object outline provides a simple, but also reliable and convenient training-free guidance signal for the underlying inpainting model that is often sufficient to fill out the mask with an object of the correct class without further text guidance and preserve the correspondence between generated images and the mask annotations with high precision. Our experimental results reveal that our method successfully generates realistic variations of object instances, preserving their shape characteristics while introducing diversity within the augmented area. We also show that the proposed method can naturally be combined with text guidance and other image augmentation techniques.
Computer Science
What problem does this paper attempt to address?
This paper addresses the cost of building instance segmentation datasets: accurate mask annotations are expensive and labor-intensive to produce. The authors propose to start from a small annotated dataset and enlarge it with images generated by a diffusion-based inpainting model. The model fills the masked area of an annotated object, guided by the object's outline, so that each generated image can be paired with the original mask annotation. The resulting image-mask pairs are added to the training set. According to the paper, the outline alone is often sufficient to produce an object of the correct class without further text guidance, while the generated object keeps the shape of the original and adds visual diversity within the masked area.

The main contributions of the paper are:
1. A method that uses a diffusion model to create synthetic object variants that closely follow the outline of the original object, so the original object mask remains valid.
2. A demonstration that the object outline is a simple, indirect but robust guidance signal for a latent diffusion inpainting model.
3. A demonstration that this guidance signal can be combined with image operations such as scaling, flipping and color changes, and an illustration of its use in data augmentation.

With this method, a small instance segmentation dataset can be enlarged while keeping the mask annotations precise, which supports training more accurate and robust computer vision models.
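To make the notions of "object outline" and "mask-preserving augmentation" more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function names `mask_outline` and `flip_pair` are hypothetical, the outline is approximated as the foreground pixels that touch the background (4-connectivity), and the step that passes the outline to the inpainting model is omitted because this summary does not specify it.

```python
import numpy as np


def mask_outline(mask: np.ndarray) -> np.ndarray:
    """Return the boundary pixels of a binary mask (assumed 4-connectivity)."""
    m = mask.astype(bool)
    # Pad with background so that pixels on the image border count as boundary.
    padded = np.pad(m, 1, constant_values=False)
    # A pixel is interior if it and its four direct neighbours are foreground.
    interior = (
        padded[1:-1, 1:-1]
        & padded[:-2, 1:-1]   # neighbour above
        & padded[2:, 1:-1]    # neighbour below
        & padded[1:-1, :-2]   # neighbour to the left
        & padded[1:-1, 2:]    # neighbour to the right
    )
    return m & ~interior


def flip_pair(image: np.ndarray, mask: np.ndarray):
    """Flip an image and its mask horizontally so the pair stays aligned."""
    return image[:, ::-1].copy(), mask[:, ::-1].copy()


# Toy example: an L-shaped object in a 6x6 mask with a random RGB image.
mask = np.zeros((6, 6), dtype=bool)
mask[1:5, 1:3] = True
mask[3:5, 3:5] = True
image = np.random.default_rng(0).integers(0, 256, size=(6, 6, 3), dtype=np.uint8)

outline = mask_outline(mask)
flipped_image, flipped_mask = flip_pair(image, mask)
```

Because the same geometric operation is applied to the image and to the mask, the outline computed from the flipped mask equals the flipped outline, which is the property that lets such operations be combined with outline guidance in a data augmentation pipeline.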