Clothing Generation by Multi-Modal Embedding: A Compatibility Matrix-Regularized GAN Model.

Linlin Liu, Haijun Zhang, Dongliang Zhou
DOI: https://doi.org/10.1016/j.imavis.2021.104097
IF: 3.86
2021-01-01
Image and Vision Computing
Abstract: Clothing compatibility learning has gained increasing research attention because a properly coordinated outfit can express personality and greatly improve an individual's appearance. In this paper, we propose a Compatibility Matrix-Regularized Generative Adversarial Network (CMRGAN) for compatible item generation. In particular, we use a multi-modal embedding to transform the image and text information of an input clothing item into a latent feature code. Subsequently, compatibility learning among the latent features is performed to obtain a compatibility style space. The feature of the input image is then regularized by this style space. Finally, a compatible clothing image is generated by a decoder fed with the regularized features. To verify the proposed model, we train three evaluators: an Inception-v3 classification model to evaluate the authenticity of synthesized images, a regression scoring VGG model to measure the compatibility degree of the generated image pairs, and a deep attentional multimodal similarity model to evaluate the semantic similarity between generated images and ground truth text descriptions. To give an objective evaluation, these models are trained on datasets consisting of fashion data only. The results demonstrate the effectiveness of the proposed method on image-to-image translation based on the compatibility space.
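The four stages named in the abstract (multi-modal embedding, compatibility style space, regularization, decoding) can be pictured as a simple data flow. The following is only a minimal sketch under strong assumptions: every layer is a single random linear map, the feature sizes are arbitrary, and the names `embed`, `regularize`, `decode`, `generate`, and `init_params` are hypothetical. It is not the CMRGAN architecture, and it omits the adversarial training and the learned compatibility objective entirely.

```python
import numpy as np


def embed(image_feat, text_feat, w_img, w_txt):
    """Fuse image and text features into one latent code (assumed linear fusion)."""
    return np.tanh(image_feat @ w_img + text_feat @ w_txt)


def regularize(latent, compat_matrix):
    """Project the latent code with a compatibility matrix (assumed to be a plain linear map)."""
    return latent @ compat_matrix


def decode(style_code, w_dec):
    """Map the regularized code to pixel-like values in (0, 1) (stand-in for a decoder)."""
    return 1.0 / (1.0 + np.exp(-(style_code @ w_dec)))


def init_params(rng, img_dim=64, txt_dim=32, latent_dim=16, out_dim=48):
    """Random stand-in weights; in a real model these would be learned."""
    return {
        "w_img": rng.normal(scale=0.1, size=(img_dim, latent_dim)),
        "w_txt": rng.normal(scale=0.1, size=(txt_dim, latent_dim)),
        "compat": rng.normal(scale=0.1, size=(latent_dim, latent_dim)),
        "w_dec": rng.normal(scale=0.1, size=(latent_dim, out_dim)),
    }


def generate(image_feat, text_feat, params):
    """Run the stages in the order the abstract lists them."""
    latent = embed(image_feat, text_feat, params["w_img"], params["w_txt"])
    style_code = regularize(latent, params["compat"])
    return decode(style_code, params["w_dec"])


rng = np.random.default_rng(0)
params = init_params(rng)
image_feat = rng.normal(size=64)  # placeholder image feature vector
text_feat = rng.normal(size=32)   # placeholder text feature vector
generated = generate(image_feat, text_feat, params)
print(generated.shape)
```

The sketch only fixes the order of operations: the compatibility matrix acts on the fused latent code before decoding, so the decoder never sees the unregularized features.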