DM-VTON: Distilled Mobile Real-time Virtual Try-On

Khoi-Nguyen Nguyen-Ngoc, Thanh-Tung Phan-Nguyen, Khanh-Duy Le, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le

2023-08-26

Abstract: The fashion e-commerce industry has witnessed significant growth in recent years, prompting the exploration of image-based virtual try-on techniques to incorporate Augmented Reality (AR) experiences into online shopping platforms. However, existing research has largely overlooked a crucial aspect: the runtime of the underlying machine-learning model. While existing methods prioritize enhancing output quality, they often disregard execution time, which restricts their application to a limited range of devices. To address this gap, we propose Distilled Mobile Real-time Virtual Try-On (DM-VTON), a novel virtual try-on framework designed to achieve simplicity and efficiency. Our approach is based on a knowledge distillation scheme that leverages a strong Teacher network as supervision to guide a Student network without relying on human parsing. Notably, we introduce an efficient Mobile Generative Module within the Student network, significantly reducing the runtime while ensuring high-quality output. Additionally, we propose Virtual Try-on-guided Pose for Data Synthesis to address the limited pose variation observed in training images. Experimental results show that the proposed method can achieve 40 frames per second on a single Nvidia Tesla T4 GPU and takes up only 37 MB of memory while producing almost the same output quality as other state-of-the-art methods. DM-VTON stands poised to facilitate the advancement of real-time AR applications, in addition to the generation of lifelike attired human figures tailored for diverse specialized training tasks. More information: https://sites.google.com/view/ltnghia/research/DMVTON

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The main problem this paper addresses is the slow execution and high memory consumption of existing virtual try-on methods. Although existing image-based virtual try-on methods can generate high-quality results, they usually take a long time to run and occupy a large amount of memory, which limits their use in real-time applications, especially on mobile devices. To solve these problems, the authors propose a new framework named Distilled Mobile Real-time Virtual Try-On (DM-VTON). The framework uses knowledge distillation: a powerful Teacher network guides the training of a Student network, and the Student network adopts a lightweight design that reduces runtime and memory consumption while maintaining high output quality. In addition, to address the limited variation of human poses in the training data, the authors propose a data synthesis pipeline named Virtual Try-on-guided Pose for Data Synthesis (VTP-DS) to enrich pose diversity. In short, DM-VTON aims to improve the real-time performance and resource efficiency of virtual try-on, making it better suited to resource-constrained devices such as smartphones and tablets and thereby improving the user's online shopping experience.
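
The Teacher-guides-Student idea can be illustrated with a deliberately tiny sketch. This is not the DM-VTON architecture or its loss functions: the names `expensive_parser`, `teacher`, `student`, and `distill`, the one-parameter linear models, and the squared-error objective are all assumptions made for illustration. It only shows the general pattern in which a frozen Teacher that depends on an extra parsing step supplies the training targets for a Student that sees the raw input alone.

```python
def expensive_parser(x):
    """Hypothetical stand-in for a human-parsing step: extra input derived from x."""
    return 0.5 * x


def teacher(x):
    """Frozen toy teacher: uses the raw input and the parser output."""
    return 2.0 * x + expensive_parser(x)


def student(x, w):
    """Toy lightweight student: sees only the raw input, no parser."""
    return w * x


def distill(inputs, epochs=200, lr=0.05):
    """Fit the student's weight by gradient descent on the mean squared
    error between student and teacher outputs (no ground-truth labels)."""
    w = 0.0
    for _ in range(epochs):
        grad = 0.0
        for x in inputs:
            target = teacher(x)  # teacher output acts as supervision
            grad += 2.0 * (student(x, w) - target) * x
        w -= lr * grad / len(inputs)
    return w


inputs = [-1.0, -0.5, 0.5, 1.0]
w = distill(inputs)
print(f"student weight: {w:.4f}")
```

After training, the student reproduces the teacher's outputs on new inputs without ever calling the parser, which is the property that makes a parser-free student cheaper at inference time in this toy setting.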

Toward Realistic Virtual Try-on Through Landmark Guided Shape Matching

DP-VTON: Toward Detail-Preserving Image-Based Virtual Try-on Network

DH-VTON: Deep Text-Driven Virtual Try-On via Hybrid Attention Learning

MT-VTON: Multilevel Transformation-Based Virtual Try-On for Enhancing Realism of Clothing

Fashion-VDM: Video Diffusion Model for Virtual Try-On

Time-Efficient and Identity-Consistent Virtual Try-On Using A Variant of Altered Diffusion Models

C-VTON: Context-Driven Image-Based Virtual Try-On Network

VTNCT: an Image-Based Virtual Try-on Network by Combining Feature with Pixel Transformation

ACDG-VTON: Accurate and Contained Diffusion Generation for Virtual Try-On

PG-VTON: A Novel Image-Based Virtual Try-On Method Via Progressive Inference Paradigm

Improving Diffusion Models for Authentic Virtual Try-on in the Wild

Toward Detail-Oriented Image-Based Virtual Try-On with Arbitrary Poses

UF-VTON: Toward User-Friendly Virtual Try-On Network

StyleVTON: A multi-pose virtual try-on with identity and clothing detail preservation

DreamVTON: Customizing 3D Virtual Try-on with Personalized Diffusion Models

VTON-MP: Multi-Pose Virtual Try-On Via Appearance Flow and Feature Filtering

LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On

Improving Diffusion Models for Virtual Try-on

OutfitAnyone: Ultra-high Quality Virtual Try-On for Any Clothing and Any Person

StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On