Down to the Last Detail: Virtual Try-on with Fine-grained Details

Jiahang Wang, Tong Sha, Wei Zhang, Zhoujun Li, Tao Mei
DOI: https://doi.org/10.1145/3394171.3413514
2020-01-01
Abstract: Virtual try-on has attracted considerable research attention due to its potential applications in e-commerce, virtual reality, and fashion design. However, existing methods struggle to preserve fine-grained details (e.g., clothing texture, facial identity, hair style, skin tone) during generation, because of non-rigid body deformation and multi-scale details. In this work, we propose a multi-stage framework for synthesizing person images in which fine-grained details are well preserved. To address long-range translation and rich-detail generation, we propose a Tree-Block (tree dilated fusion block) to replace the standard ResNet block where applicable. Notably, by incorporating larger spatial context at multiple scales, multi-scale feature maps can be smoothly fused for fine-grained detail generation. With a carefully designed end-to-end training scheme, the whole framework can be jointly optimized, yielding results with significantly better visual fidelity and richer details. We also explore a potential application in video-based virtual try-on: by combining the well-trained image generator with an extra video-level adaptor, a model photo can be animated with a driving pose sequence. Extensive evaluations on standard datasets and a user study demonstrate that the proposed framework achieves state-of-the-art results, especially in preserving visual details in clothing texture and facial identity. Our implementation is publicly available at https://github.com/JDAI-CV/Down-to-the-Last-Detail-Virtual-Try-on-with-Detail-Carving.
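The abstract does not spell out the internal layout of the Tree-Block, so the following is only a minimal sketch of the general idea it names: parallel dilated convolutions that see progressively larger spatial context, fused into one feature map with a residual connection. It is written in plain Python on a 1-D signal for readability. The function names (`dilated_conv1d`, `multi_scale_fusion`, `receptive_field`), the dilation rates (1, 2, 4), and the averaging fusion are all illustrative assumptions, not the authors' implementation.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1-D dilated convolution (cross-correlation form)."""
    k = len(kernel)
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` samples apart around position i.
            idx = i + (j - k // 2) * dilation
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilation):
    """Span of input samples covered by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def multi_scale_fusion(signal, kernel, dilations=(1, 2, 4)):
    """Hypothetical fusion: average parallel dilated branches, then add a residual."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    fused = [sum(vals) / len(branches) for vals in zip(*branches)]
    # Residual connection, as in a ResNet-style block.
    return [x + f for x, f in zip(signal, fused)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
    for d in (1, 2, 4):
        print(d, receptive_field(3, d), dilated_conv1d(impulse, [1, 1, 1], d))
    print(multi_scale_fusion(impulse, [1, 1, 1]))
```

Feeding in a single impulse shows how each branch spreads information over a wider span as the dilation grows, while the kernel size (and hence the parameter count) stays fixed. This is the property that makes dilated branches attractive for combining local texture with longer-range context.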