An Inpainting-Infused Pipeline for Attire and Background Replacement

Felipe Rodrigues Perche-Mahlow, André Felipe-Zanella, William Alberto Cruz-Castañeda, Marcellus Amadeus

2024-02-06

Abstract: In recent years, groundbreaking advancements in Generative Artificial Intelligence (GenAI) have triggered a transformative paradigm shift, significantly influencing various domains. In this work, we specifically explore an integrated approach, leveraging advanced techniques in GenAI and computer vision, with an emphasis on image manipulation. The methodology unfolds through several stages, including depth estimation, the creation of inpaint masks based on depth information, the generation and replacement of backgrounds utilizing Stable Diffusion in conjunction with Latent Consistency Models (LCMs), and the subsequent replacement of clothes and application of aesthetic changes through an inpainting pipeline. Experiments conducted in this study underscore the methodology's efficacy, highlighting its potential to produce visually captivating content. The convergence of these advanced techniques allows users to input photographs of individuals and manipulate them to modify clothing and background based on specific prompts without manually supplying inpainting masks, effectively placing the subjects within the vast landscape of creative imagination.

Computer Vision and Pattern Recognition, Artificial Intelligence, Computation and Language

What problem does this paper attempt to address?

The paper proposes an integrated approach for clothing and background replacement, primarily addressing the issues of changing a person's clothing and background in images. This method leverages generative artificial intelligence (GenAI) and computer vision technologies, particularly depth estimation, inpainting techniques, and background generation in image processing. Specifically, the study adopts the following steps:

1. **Depth Estimation and Inpainting Mask Creation**: First, the MiDaS algorithm is used for depth estimation, and an inpainting mask is created based on the depth information. Then, threshold segmentation is used to determine which parts need to be retained or modified, and facial recognition technology is combined to ensure that facial features are not altered.
2. **Background Generation and Replacement**: Stable Diffusion and Latent Consistency Models (LCMs) are used to generate new background images and replace the original background.
3. **Clothing Generation**: The inpainting model of Stable Diffusion XL is utilized to generate new clothing styles based on prompts while retaining specific areas.

The experimental results demonstrate the applicability and flexibility of this method in different scenarios, effectively generating images that match specific backgrounds and clothing styles. Additionally, the paper discusses some challenges, such as the potential inaccuracy in generating hand, foot, and arm positions in certain cases.

In summary, this work provides an innovative solution that allows users to easily modify the clothing and background of people in photos without manually creating complex inpainting masks, thereby greatly expanding the possibilities for creative applications.
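The depth-thresholding and background-replacement steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the 0.5 threshold, and the `protect_box` argument (a stand-in for a face-detector bounding box) are all hypothetical, and the depth map is assumed to have been produced already by a MiDaS-style model that outputs relative inverse depth, so that larger values mean nearer pixels.

```python
import numpy as np


def depth_to_inpaint_mask(depth, threshold=0.5, protect_box=None):
    """Build an inpainting mask from a depth map (hypothetical helper).

    Returns a uint8 array where 255 marks pixels to repaint (background)
    and 0 marks pixels to keep (foreground subject).
    """
    depth = np.asarray(depth, dtype=np.float32)
    span = float(depth.max() - depth.min())
    # Normalize to [0, 1]; a completely flat map is treated as all background.
    norm = (depth - depth.min()) / span if span > 0 else np.zeros_like(depth)
    # Assumption: larger value = nearer to the camera (inverse depth).
    foreground = norm >= threshold
    mask = np.where(foreground, 0, 255).astype(np.uint8)
    if protect_box is not None:
        # Assumed (top, left, bottom, right) box, e.g. from a face detector;
        # pixels inside it are never repainted.
        top, left, bottom, right = protect_box
        mask[top:bottom, left:right] = 0
    return mask


def composite(original, generated, mask):
    """Take generated pixels where mask == 255, original pixels elsewhere."""
    return np.where(mask[..., None] == 255, generated, original)


# Toy 2x3 depth map: the right-hand pixels are nearest to the camera.
depth = np.array([[0.1, 0.2, 0.9],
                  [0.1, 0.8, 0.9]])
mask = depth_to_inpaint_mask(depth, threshold=0.5)
protected = depth_to_inpaint_mask(depth, threshold=0.5, protect_box=(0, 0, 1, 1))

original = np.zeros((2, 3, 3), dtype=np.uint8)         # stand-in for the photo
generated = np.full((2, 3, 3), 200, dtype=np.uint8)    # stand-in for a new background
result = composite(original, generated, mask)
```

In a full pipeline the `generated` array would come from a diffusion model and the mask would also be passed to an inpainting model for the clothing step; here both are replaced by constant arrays so that the masking logic can be checked in isolation.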

Intelli-Paint: Towards Developing Human-like Painting Agents

RealtimeGen: an Intervenable AI Image Generation System for Commercial Digital Art Asset Creators

A Progressive Image Inpainting Algorithm with a Mask Auto-update Branch

Inpaint Anything: Segment Anything Meets Image Inpainting

Personalized Face Inpainting with Diffusion Models by Parallel Visual Attention

Reimagining Reality: A Comprehensive Survey of Video Inpainting Techniques

LaFIn: Generative Landmark Guided Face Inpainting

Image Inpainting Models are Effective Tools for Instruction-guided Image Editing

Deep Interactive Video Inpainting: an Invisibility Cloak for Harry Potter.

Continuation of Famous Art with AI: A Conditional Adversarial Network Inpainting Approach

Video Inpainting of Complex Scenes

PAINT: Photo-realistic Fashion Design Synthesis

Contextual Attention Mechanism, SRGAN Based Inpainting System for Eliminating Interruptions from Images

Anywhere: A Multi-Agent Framework for Reliable and Diverse Foreground-Conditioned Image Inpainting

Guided Image Inpainting: Replacing an Image Region by Pulling Content from Another Image

Coherent and Multi-modality Image Inpainting via Latent Space Optimization

Generative 3D Animation Pipelines: Automating Facial Retargeting Workflows

Paint by Inpaint: Learning to Add Image Objects by Removing Them First

Reference-based Painterly Inpainting via Diffusion: Crossing the Wild Reference Domain Gap

Do Inpainting Yourself: Generative Facial Inpainting Guided by Exemplars