Decoding Style: Efficient Fine-Tuning of LLMs for Image-Guided Outfit Recommendation with Preference

Najmeh Forouzandehmehr, Nima Farrokhsiar, Ramin Giahi, Evren Korpeoglu, Kannan Achan

2024-09-19

Abstract: Personalized outfit recommendation remains a complex challenge, demanding both fashion compatibility understanding and trend awareness. This paper presents a novel framework that harnesses the expressive power of large language models (LLMs) for this task, mitigating their "black box" and static nature through fine-tuning and direct feedback integration. We bridge the visual-textual gap in item descriptions by employing image captioning with a Multimodal Large Language Model (MLLM). This enables the LLM to extract style and color characteristics from human-curated fashion images, forming the basis for personalized recommendations. The LLM is efficiently fine-tuned on the open-source Polyvore dataset of curated fashion images, optimizing its ability to recommend stylish outfits. A direct preference mechanism using negative examples is employed to enhance the LLM's decision-making process. This creates a self-enhancing AI feedback loop that continuously refines recommendations in line with seasonal fashion trends. Our framework is evaluated on the Polyvore dataset on two key tasks: fill-in-the-blank and complementary item retrieval. These evaluations underline the framework's ability to generate stylish, trend-aligned outfit suggestions that continuously improve through direct feedback. The results demonstrate that the proposed framework significantly outperforms the base LLM, creating more cohesive outfits. The improved performance on these tasks underscores the framework's potential to enhance the shopping experience with accurate suggestions, and its advantage over vanilla LLM-based outfit generation.
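To make the fill-in-the-blank task concrete, here is a minimal sketch of how accuracy on it could be computed. It assumes each question holds the captions of the items already in an outfit, a set of candidate captions, and the index of the ground-truth item. The `FitbQuestion` type, the `word_overlap_chooser` function, and the example captions are hypothetical stand-ins for illustration only; in the paper's setting the choice would come from the fine-tuned LLM, whose prompting details are not given here.

```python
from typing import Callable, List, Sequence, TypedDict


class FitbQuestion(TypedDict):
    """Hypothetical record layout for one fill-in-the-blank question."""
    context: List[str]     # captions of the items already in the outfit
    candidates: List[str]  # captions of the candidate items for the blank
    answer: int            # index of the ground-truth candidate


def word_overlap_chooser(context: Sequence[str], candidates: Sequence[str]) -> int:
    """Toy stand-in for the model: pick the candidate that shares the most
    words with the context captions (ties go to the earliest candidate)."""
    context_words = set(" ".join(context).lower().split())
    scores = [len(context_words & set(c.lower().split())) for c in candidates]
    return scores.index(max(scores))


def fitb_accuracy(
    questions: Sequence[FitbQuestion],
    choose: Callable[[Sequence[str], Sequence[str]], int],
) -> float:
    """Fraction of questions where the chooser selects the ground-truth item."""
    if not questions:
        return 0.0
    correct = sum(
        1 for q in questions if choose(q["context"], q["candidates"]) == q["answer"]
    )
    return correct / len(questions)


# Made-up example data, not taken from Polyvore.
QUESTIONS: List[FitbQuestion] = [
    {
        "context": ["navy wool blazer", "white cotton shirt"],
        "candidates": ["neon plastic sandals", "navy wool trousers", "red sequin top"],
        "answer": 1,
    },
    {
        "context": ["floral summer dress", "straw sun hat"],
        "candidates": ["black leather boots", "tan summer sandals"],
        "answer": 0,
    },
]

if __name__ == "__main__":
    print(fitb_accuracy(QUESTIONS, word_overlap_chooser))
```

The chooser is passed in as a function so that the same accuracy routine can score a base model and a fine-tuned model on identical questions, which is the comparison the abstract reports.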

Information Retrieval, Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

The paper addresses personalized outfit recommendation. Specifically, it aims to build an automated recommender that understands the compatibility between clothing items, tracks current fashion trends, and tailors its suggestions to users' personal preferences. To this end, it proposes a framework that leverages the expressive power of large language models (LLMs) and mitigates their "black box" and static nature through fine-tuning and direct feedback integration. The framework bridges the visual-textual gap in item descriptions by generating image captions with a multimodal large language model (MLLM), which lets the LLM extract style and color features from human-curated fashion images and lays the foundation for personalized recommendations. The LLM is efficiently fine-tuned on the open-source Polyvore dataset to improve its ability to recommend stylish outfits, and a direct preference mechanism using negative examples strengthens its decision-making, forming a self-reinforcing AI feedback loop that keeps recommendations aligned with seasonal fashion trends. Experimental results show that the framework significantly outperforms the base LLM on two key tasks: fill-in-the-blank and complementary item retrieval.
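The text above says only that a "direct preference mechanism using negative examples" is used. One common way to realize such a mechanism is a DPO-style (Direct Preference Optimization) loss over pairs of a preferred and a rejected completion. The sketch below assumes that reading; the function name, the `beta` value, and the log-probability inputs are illustrative assumptions, not details confirmed by the paper.

```python
import math


def dpo_style_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed temperature; not a value from the paper
) -> float:
    """DPO-style loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a completion under either
    the model being tuned ("policy") or a frozen reference model.
    loss = -log(sigmoid(beta * margin)), where the margin measures how much
    more the policy favors the chosen outfit over the rejected one, relative
    to the reference model.
    """
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    x = beta * margin
    # -log(sigmoid(x)) == softplus(-x), computed in a numerically stable form.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # Policy identical to the reference: margin is 0, loss is log(2).
    print(dpo_style_loss(-5.0, -5.0, -5.0, -5.0))
    # Policy now favors the chosen outfit over the negative example: lower loss.
    print(dpo_style_loss(-4.0, -6.0, -5.0, -5.0))
```

Under this assumption, the "negative examples" would supply the rejected completion in each pair (for instance, an outfit with an incompatible item swapped in), and minimizing the loss pushes the model toward the curated outfit.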

Character-based Outfit Generation with Vision-augmented Style Extraction via LLMs

Learning Fashion Compatibility with Bidirectional LSTMs

Sequential LLM Framework for Fashion Recommendation

Enhancing Visual Fashion Recommendations with Users in the Loop

Lost Your Style? Navigating with Semantic-Level Approach for Text-to-Outfit Retrieval

FashionLOGO: Prompting Multimodal Large Language Models for Fashion Logo Embeddings

Visually-Aware Personalized Recommendation using Interpretable Image Representations

Fashion++: Minimal Edits for Outfit Improvement

Fashion Recommendation and Compatibility Prediction Using Relational Network

Personalized Outfit Recommendation with Learnable Anchors

Review of Personalized Outfit Recommender

Personalized Recommendation Systems Powered By Large Language Models: Integrating Semantic Understanding and User Preferences

Large Scale Visual Recommendations From Street Fashion Images

Learning Outfit Compatibility with Graph Attention Network and Visual-Semantic Embedding

Personalized Fashion Design

Styling with Attention to Details

Low-Rank Regularized Multi-Representation Learning for Fashion Compatibility Prediction

Recommending Outfits from Personal Closet

Modality-Oriented Graph Learning Toward Outfit Compatibility Modeling