Toward Accurate and Realistic Outfits Visualization with Attention to Details

Kedan Li, Min Jin Chong, Jeffrey Zhang, Jingen Liu
DOI: https://doi.org/10.48550/arXiv.2106.06593
2021-06-12
Abstract: Virtual try-on methods aim to generate images of fashion models wearing arbitrary combinations of garments. This is a challenging task because the generated image must appear realistic and accurately display the interaction between garments. Prior works produce images that are filled with artifacts and fail to capture important visual details necessary for commercial applications. We propose Outfit Visualization Net (OVNet) to capture these important details (e.g. buttons, shading, textures, realistic hemlines, and interactions between garments) and produce high quality multiple-garment virtual try-on images. OVNet consists of 1) a semantic layout generator and 2) an image generation pipeline using multiple coordinated warps. We train the warper to output multiple warps using a cascade loss, which refines each successive warp to focus on poorly generated regions of a previous warp and yields consistent improvements in detail. In addition, we introduce a method for matching outfits with the most suitable model and produce significant improvements for both our and other previous try-on methods. Through quantitative and qualitative analysis, we demonstrate our method generates substantially higher-quality studio images compared to prior works for multi-garment outfits. An interactive interface powered by this method has been deployed on fashion e-commerce websites and received overwhelmingly positive feedback.
Computer Vision and Pattern Recognition, Graphics, Machine Learning
What problem does this paper attempt to address?
The paper aims to generate high-quality virtual try-on images of multi-garment outfits. Existing virtual try-on methods have the following shortcomings:

1. **Lack of detail**: Generated images often contain artifacts and fail to capture important visual details of garments, such as buttons, shading, textures, realistic hemlines, and the interactions between garments. These details are crucial for commercial applications because they directly affect users' purchase decisions.
2. **Difficulty handling multiple garments**: Existing methods focus mainly on single-garment try-on and perform poorly on multi-garment outfits, especially when the garments have complex shapes.
3. **Insufficient adaptability**: Existing methods perform poorly across models with different poses, skin tones, and hand positions, and struggle to generate natural, consistent images.

To address these problems, the paper proposes Outfit Visualization Net (OVNet), which generates high-quality virtual try-on images of multi-garment outfits while accurately capturing garment details and interactions. OVNet consists of two parts:

1. **Semantic Layout Generator**: Predicts the complete semantic layout from the garment image, the pose map of the model, and an incomplete semantic layout. This layout guides the subsequent image generation step and helps ensure that the generated image has a plausible structure.
2. **Multi-Warp Garment Generator**: Aligns the garment image with the model image through multiple coordinated warps and generates the final composite image. Each warp focuses on correcting the regions that the previous warp handled poorly, gradually improving the quality of the generated image.
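The cascade idea behind the multiple warps can be illustrated with a small sketch. This is not the paper's exact loss: it assumes a per-pixel absolute error and a running per-pixel minimum over the warps seen so far, so that a later warp lowers the loss only where earlier warps were still wrong. The function name `cascade_loss` and the flat-list image representation are hypothetical, chosen for illustration.

```python
def cascade_loss(warps, target):
    """Illustrative cascade-style loss (an assumption, not the paper's formula).

    warps:  list of K warped garments, each a flat list of pixel values.
    target: flat list of ground-truth pixel values of the same length.
    """
    n = len(target)
    # Lowest per-pixel error achieved by any warp seen so far.
    best = [float("inf")] * n
    total = 0.0
    for warp in warps:
        errors = [abs(w - t) for w, t in zip(warp, target)]
        # A later warp only reduces the term where it beats earlier warps.
        best = [min(b, e) for b, e in zip(best, errors)]
        total += sum(best) / n
    # Average the per-stage terms so the value is comparable across K.
    return total / len(warps)


target = [1.0, 1.0, 0.0, 0.0]
first_warp = [1.0, 0.0, 0.0, 0.0]   # wrong at pixel 1
second_warp = [0.0, 1.0, 0.0, 0.0]  # fixes pixel 1, wrong at pixel 0
loss = cascade_loss([first_warp, second_warp], target)
```

In this toy case the second warp is wrong where the first was already right, yet the loss still drops because it repairs the one pixel the first warp missed. That is the behavior the summary describes: each successive warp is credited for the regions its predecessors generated poorly.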
In addition, the paper introduces a garment-pose matching method that pairs each outfit with the most suitable model, which significantly improves the quality of the generated image. This method applies not only to OVNet but also improves the results of other virtual try-on methods. Through quantitative and qualitative analysis, the paper shows that OVNet produces substantially higher-quality virtual try-on images of multi-garment outfits than prior methods. An interactive interface powered by the method has been deployed on fashion e-commerce websites and has received positive feedback.