Abstract: The advent of artificial intelligence has contributed to a groundbreaking transformation of the fashion industry, redefining creativity and innovation in unprecedented ways. This work investigates methodologies for generating tailored fashion descriptions using two distinct Large Language Models and a Stable Diffusion model for fashion image creation. Emphasizing adaptability in AI-driven fashion creativity, we depart from traditional approaches and focus on prompting techniques, such as zero-shot and few-shot learning, as well as Chain-of-Thought (CoT), which yield a variety of colors and textures and so enhance the diversity of the outputs. Central to our methodology is Retrieval-Augmented Generation (RAG), which enriches the models with insights from fashion sources to ensure contemporary representations. Evaluation combines quantitative metrics such as CLIPscore with qualitative human judgment, highlighting strengths in creativity, coherence, and aesthetic appeal across diverse styles. Participants in the human evaluation preferred RAG and few-shot learning techniques for their ability to produce more relevant and appealing fashion descriptions. Our code is provided at https://github.com/georgiarg/AutoFashion.

What problem does this paper attempt to address?

The paper addresses automatic fashion image generation: the automated creation of fashion descriptions and images tailored to specific styles, occasions, and individual wearers. The research objectives include:

- Developing an automated process for generating fashion images that conform to specified styles, suit specific occasions, and are personalized for individual wearers.
- Exploring different prompting methods to generate customized fashion descriptions that match specific variables such as style, wearer type, and occasion.
- Using two large language models (LLMs) to create these descriptions and a Stable Diffusion model to generate the corresponding fashion images from them.
- Replacing the traditional "pre-training-fine-tuning" strategy with a "pre-training-prompting-predicting" strategy that relies entirely on prompts to guide the LLMs.
- Keeping the model in sync with evolving fashion trends through Retrieval-Augmented Generation (RAG), which extracts insights from sources such as fashion magazines and blogs to keep the generated descriptions relevant.

The main contributions of the paper are:

1. Creating what the paper presents as the first automated system capable of generating personalized fashion outfit descriptions and images that can be customized according to specific styles, occasions, and individual wearer profiles.
2. Creating a dataset containing fashion images and their descriptions generated by two different LLMs.
3. Using advanced prompting techniques and RAG to guide the LLMs, and developing optimized templates to generate fashion content.
4. Collecting and analyzing human evaluation data to assess the quality and accuracy of the fashion content generated by the proposed methods.
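The prompting strategies named in the objectives above (zero-shot, few-shot, Chain-of-Thought, and RAG) can be illustrated with a minimal Python sketch. It is not the paper's code: the `OutfitRequest` class, every function name, the prompt wording, and the keyword-overlap retriever are hypothetical stand-ins chosen for illustration, and the paper's actual templates and retrieval method may differ.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class OutfitRequest:
    """Hypothetical container for the three conditioning variables."""
    style: str
    occasion: str
    wearer: str


def task_line(req: OutfitRequest) -> str:
    """Render the request as a single instruction (wording is an assumption)."""
    return (
        f"Describe one outfit. Style: {req.style}. "
        f"Occasion: {req.occasion}. Wearer: {req.wearer}."
    )


def zero_shot(req: OutfitRequest) -> str:
    """Zero-shot: the instruction alone, with no examples."""
    return task_line(req)


def few_shot(req: OutfitRequest, examples: list[tuple[OutfitRequest, str]]) -> str:
    """Few-shot: solved examples first, then the new request left open."""
    blocks = [f"{task_line(ex)}\nDescription: {text}" for ex, text in examples]
    blocks.append(f"{task_line(req)}\nDescription:")
    return "\n\n".join(blocks)


def chain_of_thought(req: OutfitRequest) -> str:
    """Chain-of-Thought: ask for intermediate reasoning before the answer."""
    return (
        task_line(req)
        + "\nThink step by step: choose the garments, then the colors and "
        "textures, then the accessories. Finish with a one-paragraph description."
    )


def _words(text: str) -> set[str]:
    """Lowercased alphabetic tokens, so punctuation does not block matches."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(req: OutfitRequest, corpus: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank snippets by word overlap with the request.

    A real RAG system would more likely use embedding similarity; this
    keyword overlap is a deliberately simple substitute.
    """
    terms = _words(f"{req.style} {req.occasion} {req.wearer}")

    def overlap(snippet: str) -> int:
        return len(terms & _words(snippet))

    ranked = sorted(corpus, key=overlap, reverse=True)
    return [s for s in ranked[:k] if overlap(s) > 0]


def rag(req: OutfitRequest, corpus: list[str], k: int = 2) -> str:
    """RAG: prepend the retrieved snippets to the instruction."""
    context = "\n".join(f"- {s}" for s in retrieve(req, corpus, k))
    return f"Recent fashion notes:\n{context}\n\n{task_line(req)}"
```

Each function returns a plain string, so the same request can be sent to either LLM under each strategy and the resulting descriptions compared directly.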
The paper also details the experimental setup, evaluation metrics, and comparative results of different prompting techniques, ultimately demonstrating the potential of this approach in practical applications.
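For reference, the CLIPScore metric named in the abstract is commonly defined as a rescaled cosine similarity between CLIP embeddings of a text and an image. The form below is the standard reference-free definition; whether the paper uses this exact weighting is an assumption, and the symbols $E_c$, $E_v$, and $w$ are notation introduced here.

```latex
% E_c: CLIP text embedding of description c
% E_v: CLIP image embedding of generated image v
% w:   rescaling constant, 2.5 in the standard definition
\[
\mathrm{CLIPScore}(c, v) = w \cdot \max\bigl(\cos(E_c, E_v),\, 0\bigr),
\qquad
\cos(E_c, E_v) = \frac{E_c \cdot E_v}{\lVert E_c \rVert \, \lVert E_v \rVert}
\]
```

A higher score indicates that the generated image is closer to its description in CLIP's joint embedding space, which is why the metric is paired with human judgment for qualities such as creativity and aesthetic appeal.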

Automatic Generation of Fashion Images using Prompting in Generative Machine Learning Models
