Graphite: A Graph-based Extreme Multi-Label Short Text Classifier for Keyphrase Recommendation

Ashirbad Mishra, Soumik Dey, Jinyu Zhao, Marshall Wu, Binbin Li, Kamesh Madduri
2024-07-30
Abstract: Keyphrase Recommendation has been a pivotal problem in advertising and e-commerce, where advertisers/sellers are recommended keyphrases (search queries) to bid on to increase their sales. It is a challenging task due to the plethora of items shown on online platforms and the variety of queries that users search while showing varying interest in the displayed items. Moreover, query/keyphrase recommendations need to be made in real time and in a resource-constrained environment. This problem can be framed as an Extreme Multi-label (XML) short-text classification task, in which the input text is tagged with keyphrases as labels. Traditional neural network models are either infeasible or have slower inference latency due to large label spaces. We present Graphite, a graph-based classifier model that provides real-time keyphrase recommendations that are on par with standard text classification models. Furthermore, it doesn't utilize GPU resources, which can be limited in production environments. Due to its lightweight nature and smaller footprint, it can train on very large datasets, where state-of-the-art XML models fail due to extreme resource requirements. Graphite is deterministic, transparent, and intrinsically more interpretable than neural network-based models. We present a comprehensive analysis of our model's performance across forty categories spanning eBay's English-speaking sites.
Information Retrieval, Machine Learning
What problem does this paper attempt to address?
The paper addresses keyphrase recommendation in advertising and e-commerce: suggesting search queries that sellers can bid on to increase sales. Specifically, it focuses on recommending keyphrases relevant to an item in real time, out of a very large pool of possible queries, in a resource-constrained environment. The problem is framed as an extreme multi-label (XML) short-text classification task, in which the input text is tagged with keyphrases as labels. Traditional neural network models are either infeasible or have high inference latency for such large label spaces, so the paper proposes Graphite, a graph-based classifier model, to overcome these issues. The main features of the Graphite model include:

1. **Real-time recommendation**: Provides real-time keyphrase recommendations that are on par with standard text classification models.
2. **No dependency on GPU resources**: Suitable for production environments where GPUs can be limited.
3. **Lightweight and small memory footprint**: Can train on very large datasets, where state-of-the-art XML models fail due to extreme resource requirements.
4. **Deterministic, transparent, and interpretable**: Intrinsically more interpretable than neural network-based models.

The paper also details the construction and inference steps of the Graphite model and demonstrates through experiments that its performance on the eBay dataset surpasses other existing models, especially in resource-constrained scenarios.
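To make the XML short-text framing concrete, here is a minimal sketch of a graph-style tagger. It is not Graphite's actual construction or inference procedure, which this summary does not specify. It assumes a simple bipartite mapping from title tokens to keyphrase labels and ranks labels by token overlap. The function names `build_token_graph` and `recommend`, the toy training data, and the scoring rule are all hypothetical.

```python
from collections import defaultdict

def build_token_graph(items):
    """Link each title token to the keyphrases of every title containing it.

    Hypothetical structure: a bipartite token -> keyphrase mapping,
    assumed here purely for illustration.
    """
    graph = defaultdict(set)
    for title, keyphrases in items:
        for token in set(title.lower().split()):
            graph[token].update(keyphrases)
    return graph

def recommend(graph, title, top_k=3):
    """Rank keyphrases by how many tokens of the input title link to them."""
    scores = defaultdict(int)
    for token in set(title.lower().split()):
        for keyphrase in graph.get(token, ()):
            scores[keyphrase] += 1
    # Sort by descending score, then alphabetically, so the same input
    # always yields the same ranking (no randomness, CPU only).
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [keyphrase for keyphrase, _ in ranked[:top_k]]

# Toy training data (invented): item title -> keyphrases used as labels.
TRAIN = [
    ("apple iphone 12 case black", ["iphone 12 case", "phone case"]),
    ("samsung galaxy s21 case", ["galaxy s21 case", "phone case"]),
    ("apple iphone 12 charger", ["iphone 12 charger", "iphone charger"]),
]

graph = build_token_graph(TRAIN)
result = recommend(graph, "iphone 12 case red")
print(result)  # ['iphone 12 case', 'phone case', 'iphone 12 charger']
```

Each recommendation in this sketch can be traced back to the specific tokens that voted for it, which hints at why a graph lookup can be easier to inspect than a neural model. A production system would need a far more careful scoring and pruning scheme.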