PAE: LLM-based Product Attribute Extraction for E-Commerce Fashion Trends

Apurva Sinha, Ekta Gujral

2024-05-28

Abstract: Product attribute extraction is a growing field in e-commerce, with applications including product ranking, product recommendation, future assortment planning, and improving the online shopping experience. Understanding customer needs is a critical part of online business, especially for fashion products. Retailers use assortment planning to determine the mix of products to offer in each store and channel, to stay responsive to market dynamics, and to manage inventory and catalogs. The goal is to offer the right styles, in the right sizes and colors, through the right channels. When shoppers find products that meet their needs and desires, they are more likely to return for future purchases, fostering customer loyalty. Product attributes are a key factor in assortment planning. In this paper we present PAE, a product attribute extraction algorithm for future trend reports consisting of text and images in PDF format. Most existing methods focus on attribute extraction from titles or product descriptions, or utilize visual information from existing product images. Compared to prior work, our work focuses on attribute extraction from PDF files in which upcoming fashion trends are explained. This work proposes a more comprehensive framework that fully utilizes the different modalities for attribute extraction and helps retailers plan their assortment in advance. Our contributions are three-fold: (a) We develop PAE, an efficient framework to extract attributes from unstructured data (text and images); (b) We provide a catalog matching methodology based on BERT representations to discover existing attributes using upcoming attribute values; (c) We conduct extensive experiments with several baselines and show that PAE is an effective, flexible framework that is on par with or superior to the existing state of the art (avg 92.5% F1-Score) on the attribute value extraction task.

Artificial Intelligence, Computer Vision and Pattern Recognition, Machine Learning

What problem does this paper attempt to address?

The paper addresses product attribute extraction in e-commerce, focusing on the text and image data found in fashion trend reports. Existing methods mostly extract attributes from titles or product descriptions, or use visual information from existing product images. This work instead extracts attributes from PDF files that describe upcoming fashion trends, which helps retailers plan their product assortments in advance.

The main contributions of the paper are:

1. PAE (Product Attribute Extraction), a framework that efficiently extracts attributes from unstructured text and image data.
2. A catalog matching approach based on BERT representations that uses upcoming attribute values to discover existing attributes.
3. Extensive experiments against multiple baseline models, showing that PAE performs as well as or better than state-of-the-art methods on the attribute value extraction task, with an average F1 score of 92.5%.

The paper discusses the challenges of extracting text and images from PDF files, such as misspelled text, loss of image quality, and multi-label attribute recognition, and proposes corresponding solutions. Matching the extracted attributes against the product catalog can improve the quality of search tags, leading to a better shopping experience for customers. The paper also discusses how to use unsupervised models and how to extract interpretable visual attributes from unlabeled data. The experimental results show that PAE performs well on multiple datasets, with F1 scores exceeding 90% for both text and images, demonstrating its effectiveness and flexibility.
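The catalog matching idea can be illustrated with a minimal sketch: embed each extracted attribute value and each catalog value, then map the extracted value to its nearest catalog entry by cosine similarity. This is not the authors' implementation. The paper uses BERT representations, while the sketch below swaps in character trigram counts as a stand-in embedding so that it runs without any model download. The function names (`embed`, `cosine`, `match_to_catalog`), the example values, and the `0.5` threshold are all hypothetical.

```python
import math
from collections import Counter


def embed(text: str, n: int = 3) -> Counter:
    """Stand-in embedding: character n-gram counts (assumption, NOT BERT)."""
    padded = f" {text.lower().strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_to_catalog(extracted, catalog, threshold=0.5):
    """Map each extracted value to its most similar catalog value.

    Values whose best similarity falls below `threshold` (a hypothetical
    cutoff) map to None, i.e. they have no counterpart in the catalog.
    """
    catalog_vecs = {value: embed(value) for value in catalog}
    matches = {}
    for value in extracted:
        vec = embed(value)
        best, score = max(
            ((c, cosine(vec, cv)) for c, cv in catalog_vecs.items()),
            key=lambda pair: pair[1],
            default=(None, 0.0),
        )
        matches[value] = best if score >= threshold else None
    return matches


if __name__ == "__main__":
    # Hypothetical values: the first one carries an extraction misspelling.
    extracted = ["flroal print", "puff sleeve", "neon green"]
    catalog = ["floral print", "puff sleeves", "striped"]
    for value, match in match_to_catalog(extracted, catalog).items():
        print(f"{value!r} -> {match!r}")
```

Replacing `embed` with a function that returns sentence vectors from a BERT-style encoder would bring the sketch closer to what the paper describes; the nearest-neighbor and threshold logic would stay the same.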

Large Scale Generative Multimodal Attribute Extraction for E-commerce Attributes

Unsupervised Extraction of Common Product Attributes From E-Commerce Websites by Considering Client Suggestion

Attribute Extraction from Product Titles in eCommerce

Enhanced E-Commerce Attribute Extraction: Innovating with Decorative Relation Correction and LLAMA 2.0-Based Annotation

PAM: Understanding Product Images in Cross Product Category Attribute Extraction

Deep Recurrent Neural Networks for Product Attribute Extraction in eCommerce

Using LLMs for the Extraction and Normalization of Product Attribute Values

Visually Similar Products Retrieval for Shopsy

Multimodal Joint Attribute Prediction and Value Extraction for E-commerce Product

JPAVE: A Generation and Classification-based Model for Joint Product Attribute Prediction and Value Extraction

Multi-Label Zero-Shot Product Attribute-Value Extraction

EAVE: Efficient Product Attribute Value Extraction via Lightweight Sparse-layer Interaction

When relevance is not Enough: Promoting Visual Attractiveness for Fashion E-commerce

Attr2Style: A Transfer Learning Approach for Inferring Fashion Styles via Apparel Attributes

Progressive Fashion Attribute Extraction

AI Tailoring: Evaluating Influence of Image Features on Fashion Product Popularity

AI Assisted Apparel Design

Using Artificial Intelligence to Analyze Fashion Trends

LaTeX-Numeric: Language-agnostic Text attribute eXtraction for E-commerce Numeric Attributes

Boosting Multi-Modal E-commerce Attribute Value Extraction via Unified Learning Scheme and Dynamic Range Minimization