AnyFit: Controllable Virtual Try-on for Any Combination of Attire Across Any Scenario

Yuhan Li, Hao Zhou, Wenxiang Shang, Ran Lin, Xuanhong Chen, Bingbing Ni
2024-05-28
Abstract: While image-based virtual try-on has made significant strides, emerging approaches still fall short of delivering high-fidelity and robust fitting images across various scenarios, as their models suffer from ill-fitted garment styles and quality degradation during training, not to mention the lack of support for various combinations of attire. Therefore, we first propose a lightweight, scalable operator known as Hydra Block for attire combinations. This is achieved through a parallel attention mechanism that facilitates the feature injection of multiple garments from conditionally encoded branches into the main network. Secondly, to significantly enhance the model's robustness and expressiveness in real-world scenarios, we evolve its potential across diverse settings by synthesizing the residuals of multiple models, as well as implementing a mask region boost strategy to overcome the instability caused by information leakage in existing models. Equipped with the above design, AnyFit surpasses all baselines on high-resolution benchmarks and real-world data by a large gap, excelling in producing well-fitting garments replete with photorealistic and rich details. Furthermore, AnyFit's impressive performance on high-fidelity virtual try-ons in any scenario from any image paves a new path for future research within the fashion community.
Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The paper aims to deliver high-fidelity virtual try-on across varied scenarios while supporting arbitrary combinations of garments. Existing virtual try-on methods struggle with complex poses, occlusions, and cross-category try-on, producing mismatched garment details and unrealistic textures. They also typically handle only a single garment at a time and do not support freely combining multiple pieces. To overcome these problems, the paper proposes AnyFit, a new virtual try-on framework, with the following main contributions:
1. **Scalability**: By introducing the Hydra Block, AnyFit can handle any number of conditions (such as different types of clothing), and each additional condition increases the parameter count by only 8%. The Hydra Block injects fine-grained clothing features from multiple conditional encoding branches into the main network through a parallel attention mechanism.
2. **Robustness**: To improve the stability and performance of the model in practical applications, AnyFit adopts two strategies:
   - **Prior Model Evolution**: Parameter changes from models within the same family (for example, multiple fine-tuned versions of SDXL) are merged, so that several capabilities of the base model evolve independently and its inherent potential is strengthened before training.
   - **Adaptive Mask Boost**: During training, the unparsed mask area is extended in length so that the model learns the overall shape of the garment on its own, which reduces the instability caused by information leakage. During inference, the shape of the mask area is adjusted according to the aspect ratio of the target garment to promote better try-on results, especially for long garments (such as trench coats).
3. **Performance Improvement**: Experimental results show that AnyFit significantly outperforms all baseline models on high-resolution benchmarks and real-world data, especially in generating detailed and realistic clothing. AnyFit also performs strongly on multi-garment try-on and can harmoniously combine tops and bottoms.

In conclusion, AnyFit aims to provide a highly scalable and robust virtual try-on solution that delivers high-quality results in any scenario.
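
The Prior Model Evolution idea described above, merging the parameter changes of several fine-tuned models from one family back into their shared base, can be illustrated with a toy sketch. This is not the paper's implementation: the function name `merge_residuals`, the per-model coefficients `coeffs`, and the representation of weights as plain lists of floats are all assumptions made for illustration, and the summary does not state how (or whether) the residuals are weighted.

```python
from typing import Dict, List, Sequence

# Hypothetical representation: parameter name -> flattened list of values.
Weights = Dict[str, List[float]]


def merge_residuals(
    base: Weights,
    finetuned: Sequence[Weights],
    coeffs: Sequence[float],
) -> Weights:
    """Return base + sum_i coeffs[i] * (finetuned[i] - base), per parameter.

    The coefficients are an assumption of this sketch, not a detail
    taken from the paper.
    """
    if len(finetuned) != len(coeffs):
        raise ValueError("need one coefficient per fine-tuned model")
    merged: Weights = {}
    for name, base_vals in base.items():
        out = list(base_vals)  # start from the base weights
        for model, c in zip(finetuned, coeffs):
            vals = model[name]
            if len(vals) != len(base_vals):
                raise ValueError(f"shape mismatch for {name!r}")
            # Add this model's scaled residual (its change relative to base).
            for k, (v, b) in enumerate(zip(vals, base_vals)):
                out[k] += c * (v - b)
        merged[name] = out
    return merged


# Toy example: two "fine-tuned" models that each changed different weights.
base = {"w": [1.0, 2.0], "b": [0.0]}
model_a = {"w": [1.5, 2.0], "b": [0.5]}
model_b = {"w": [1.0, 3.0], "b": [0.0]}
merged = merge_residuals(base, [model_a, model_b], [1.0, 0.5])
```

Because each model's contribution is expressed as a residual against the same base, changes that touch different weights combine without overwriting one another, which is one plausible reading of "independently evolving multiple capabilities" in the summary.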