CLOTHING FASHION IMAGE GENERATION FROM TEXT USING ARTIFICIAL INTELLIGENCE

Aqsa Shaheen, Dr. Javed Iqbal
DOI: https://doi.org/10.33564/ijeast.2023.v08i02.001
2023-06-01
International Journal of Engineering Applied Sciences and Technology
Abstract: The development of dynamic, engaging images has benefited greatly from recent rapid advances in image synthesis techniques. The architecture proposed in this research allows users to enter text describing a particular dress, and the model then creates images of fashionable apparel based on that description. The suggested model can let people become their own fashion designers by using the strength of deep learning and artificial intelligence to create a variety of fashionable outfits for themselves. The DALL-E model, an artificial intelligence model that generates realistic images from a natural-language description, is used to produce the images. While there are alternative text-to-image systems, DALL-E produces far more coherent visuals and appears to capture the world and the relationships between objects well. DALL-E uses the GPT-3 model and a dataset of text-image pairs for image synthesis. Each image is encoded into a 32×32 grid using a VQ-VAE. The image and text are then combined into a single stream for training DALL-E. The Deep Fashion dataset is used for training DALL-E; it is a more realistic dataset and contains high-definition images that further enable accurate generation. After training, DALL-E produces more accurate results and achieves a higher inception score than preceding models.
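The single-stream step described in the abstract can be illustrated with a minimal sketch. Only the 32×32 grid comes from the abstract. The text length of 256 tokens and the vocabulary sizes are assumptions borrowed from the publicly described DALL-E setup, and `build_stream` and `PAD_ID` are hypothetical names introduced here for illustration. This is not the authors' implementation.

```python
import random

GRID = 32            # image token grid side length (stated in the abstract)
TEXT_LEN = 256       # assumed maximum number of text tokens
TEXT_VOCAB = 16384   # assumed text vocabulary size
IMAGE_VOCAB = 8192   # assumed image codebook size
PAD_ID = 0           # hypothetical padding token id


def build_stream(text_tokens, image_grid):
    """Concatenate padded text tokens and flattened image tokens into one sequence."""
    if len(text_tokens) > TEXT_LEN:
        raise ValueError("text is longer than TEXT_LEN tokens")
    if len(image_grid) != GRID or any(len(row) != GRID for row in image_grid):
        raise ValueError("image grid must be GRID x GRID")
    # Pad the text part to a fixed length so the image part always starts
    # at the same position in the stream.
    padded_text = list(text_tokens) + [PAD_ID] * (TEXT_LEN - len(text_tokens))
    # Flatten the grid row by row and shift image ids past the text vocabulary
    # so the two token types do not collide in a shared id space.
    flat_image = [TEXT_VOCAB + tok for row in image_grid for tok in row]
    return padded_text + flat_image


# Stand-in data: random ids in place of a real tokenizer and a real image encoder.
rng = random.Random(0)
text_tokens = [rng.randrange(1, TEXT_VOCAB) for _ in range(12)]
image_grid = [[rng.randrange(IMAGE_VOCAB) for _ in range(GRID)] for _ in range(GRID)]
stream = build_stream(text_tokens, image_grid)
print(len(stream))  # 256 text positions + 32 * 32 image positions = 1280
```

Under these assumptions, a transformer trained on such streams would predict each token from the ones before it, so the image tokens are generated conditioned on the text tokens at the front of the sequence.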