Optimizing E-commerce Search: Toward a Generalizable and Rank-Consistent Pre-Ranking Model

Enqiang Xu, Yiming Qiu, Junyang Bai, Ping Zhang, Dadong Miao, Songlin Wang, Guoyu Tang, Lin Liu, Mingming Li
DOI: https://doi.org/10.1145/3626772.3661343
2024-08-21
Abstract: In large e-commerce platforms, search systems are typically composed of a series of modules, including recall, pre-ranking, and ranking phases. The pre-ranking phase, serving as a lightweight module, is crucial for filtering out the bulk of products in advance for the downstream ranking module. Industrial efforts on optimizing the pre-ranking model have predominantly focused on enhancing ranking consistency, model structure, and generalization towards long-tail items. Beyond these optimizations, meeting the system performance requirements presents a significant challenge. Contrasting with existing industry works, we propose a novel method: a Generalizable and RAnk-ConsistEnt Pre-Ranking Model (GRACE), which achieves: 1) Ranking consistency by introducing multiple binary classification tasks that predict whether a product is within the top-k results as estimated by the ranking model, which facilitates the addition of learning objectives on common point-wise ranking models; 2) Generalizability through contrastive learning of representation for all products by pre-training on a subset of ranking product embeddings; 3) Ease of implementation in feature construction and online deployment. Our extensive experiments demonstrate significant improvements in both offline metrics and online A/B test: a 0.75% increase in AUC and a 1.28% increase in CVR.
Information Retrieval, Machine Learning
What problem does this paper attempt to address?
The paper addresses two key issues in pre-ranking models for e-commerce search systems: ranking consistency and generalization ability.

1. **Ranking Consistency**: The pre-ranking model filters out most candidate items early in the pipeline so that the costlier downstream ranking model scores only a small set. To bring its output closer to that of the ranking model, the paper adds multiple binary classification tasks, each predicting whether an item falls within the top-k results as estimated by the ranking model. These tasks are added as extra learning objectives on a common point-wise ranking model, so the approach requires no changes to existing training data or workflows.
2. **Generalization Ability**: The pre-ranking model must handle a large number of long-tail items, whereas the ranking model only orders the items that survive pre-ranking. The paper therefore combines hash ID embeddings with attribute embeddings and enhances item representations through contrastive learning against embeddings generated by a pre-trained graph neural network.

With these optimizations, GRACE improves offline AUC by 0.75% and increases the conversion rate (CVR) by 1.28% in online A/B tests. It also increases gross merchandise volume (GMV), with particularly notable gains on long-tail items.
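The ranking-consistency idea can be illustrated with a minimal sketch of how top-k labels might be derived from ranking-model scores and turned into a multi-task objective. The function names, the choice of k values, and the unweighted sum of binary cross-entropy terms are assumptions for illustration; the summary does not specify how the paper weights or combines these tasks.

```python
import math

def topk_labels(ranking_scores, ks):
    """Return, for each item, one binary label per k in `ks`:
    1 if the item is within the ranking model's top-k, else 0."""
    order = sorted(range(len(ranking_scores)),
                   key=lambda i: ranking_scores[i], reverse=True)
    rank = {item: pos for pos, item in enumerate(order)}  # 0-based rank
    return [[1 if rank[i] < k else 0 for k in ks]
            for i in range(len(ranking_scores))]

def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted probability, clipped for stability."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def multi_task_loss(pred_probs, labels):
    """Mean over items of the BCE summed across the top-k tasks.
    Equal task weights are an assumption, not taken from the paper."""
    total = 0.0
    for probs, ys in zip(pred_probs, labels):
        total += sum(bce(p, y) for p, y in zip(probs, ys))
    return total / len(labels)

# Hypothetical ranking-model scores for four items in one request.
ranking_scores = [0.9, 0.1, 0.5, 0.7]
ks = [1, 2]  # illustrative thresholds: "in top-1" and "in top-2"
labels = topk_labels(ranking_scores, ks)

# A pre-ranking model that outputs 0.5 for every task is maximally unsure.
uninformed = [[0.5, 0.5] for _ in ranking_scores]
print(labels, multi_task_loss(uninformed, labels))
```

Because the labels come directly from the ranking model's ordering of each request, this kind of objective can sit alongside an existing point-wise click or conversion loss without changing how training samples are collected.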
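The contrastive component can be sketched in a similarly hedged way. The example below uses an InfoNCE loss with in-batch negatives, a common choice for contrastive representation learning. Whether GRACE uses this exact loss, cosine similarity, or this temperature is not stated in the summary, so all three are assumptions. Here `item_embs` stands in for the pre-ranking item representations and `target_embs` for the pre-trained embeddings they are aligned to.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(item_embs, target_embs, temperature=0.1):
    """In-batch InfoNCE: the positive for item i is target i,
    and every other target in the batch acts as a negative."""
    n = len(item_embs)
    loss = 0.0
    for i in range(n):
        logits = [cosine(item_embs[i], t) / temperature for t in target_embs]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denom - logits[i]
    return loss / n

# Toy two-item batch: aligned pairs should give a much lower loss
# than pairs whose targets have been swapped.
items = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce(items, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = info_nce(items, [[0.0, 1.0], [1.0, 0.0]])
print(aligned_loss, swapped_loss)
```

Minimizing such a loss pulls each item's representation toward its pre-trained counterpart and away from other items in the batch. That is one plausible way attribute-based representations could benefit long-tail items that have sparse interaction data.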