Abstract: Re-ranking, which rearranges the ranking list by modeling the mutual influence among items to better meet users' demands, has drawn increasing attention from both academia and industry. Many existing re-ranking methods directly take the initial ranking list as input and generate the optimal permutation through a well-designed context-wise model, which brings the evaluation-before-reranking problem. Meanwhile, evaluating all candidate permutations incurs unacceptable computational costs in practice. Thus, to better balance efficiency and effectiveness, online systems usually adopt a two-stage architecture: heuristic methods such as beam search first generate a suitable number of candidate permutations, which are then fed into an evaluation model to obtain the optimal permutation. However, existing methods in both stages can be improved in the following aspects. In the generation stage, heuristic methods use only point-wise prediction scores and lack an effective judgment. In the evaluation stage, most existing context-wise evaluation models consider only the item context and lack more fine-grained feature context modeling. This paper presents a novel end-to-end re-ranking framework named PIER to tackle the above challenges; it still follows the two-stage architecture and contains two main modules, FPSM and OCPM. In FPSM, we apply SimHash to efficiently select the top-K candidates from the full set of permutations based on the user's permutation-level interest. In OCPM, we design a novel omnidirectional attention mechanism to capture the context information in the permutation. Finally, we jointly train these two modules end-to-end by introducing a comparative learning loss. Offline experiment results demonstrate that PIER outperforms baseline models on both public and industrial datasets, and we have successfully deployed PIER on the Meituan food delivery platform.
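
The idea of using SimHash to pick the top-K candidate permutations can be illustrated with a minimal sketch. This is not the FPSM implementation. It only shows generic random-hyperplane SimHash followed by Hamming-distance ranking, and it assumes that a user interest vector and per-item embeddings are already available. The position-weighted permutation encoding and the names `make_planes`, `permutation_embedding`, and `select_top_k` are hypothetical choices made for illustration.

```python
import itertools
import random


def make_planes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (Gaussian normals) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]


def simhash(vec, planes):
    """Signature: one sign bit per hyperplane (1 if the dot product is >= 0)."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def hamming(a, b):
    """Number of positions at which two signatures differ."""
    return sum(x != y for x, y in zip(a, b))


def permutation_embedding(perm, item_emb):
    """Hypothetical encoding: sum of item embeddings weighted by 1 / (position + 1)."""
    dim = len(next(iter(item_emb.values())))
    out = [0.0] * dim
    for pos, item in enumerate(perm):
        weight = 1.0 / (pos + 1)
        for d in range(dim):
            out[d] += weight * item_emb[item][d]
    return out


def select_top_k(candidates, item_emb, user_vec, k, planes):
    """Keep the k permutations whose signatures are closest to the user's signature."""
    user_sig = simhash(user_vec, planes)
    scored = [
        (hamming(simhash(permutation_embedding(p, item_emb), planes), user_sig), p)
        for p in candidates
    ]
    scored.sort(key=lambda pair: pair[0])  # stable sort keeps input order on ties
    return [p for _, p in scored[:k]]


# Toy data (made up for illustration): three items with 2-d embeddings.
item_emb = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [-1.0, 0.0]}
user_vec = [1.0, 0.2]
planes = make_planes(dim=2, n_bits=16, seed=0)

candidates = list(itertools.permutations(item_emb, 2))  # all ordered pairs of items
top = select_top_k(candidates, item_emb, user_vec, k=3, planes=planes)
```

Because each signature is a short bit vector, comparing a candidate with the user costs only a Hamming distance, which is why hashing of this kind can screen a large candidate set before a heavier evaluation model scores the survivors.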

Context-Aware Ranking by Constructing a Virtual Environment for Reinforcement Learning

User Behavior Simulation for Search Result Re-ranking

Multi Page Search with Reinforcement Learning to Rank

Learning a Deep Listwise Context Model for Ranking Refinement

Reinforcement Learning to Rank with Markov Decision Process

Learning to Collaborate: Multi-Scenario Ranking Via Multi-Agent Reinforcement Learning

A Simple yet Effective Framework for Active Learning to Rank

Online Learning to Rank in a Listwise Approach for Information Retrieval

HyQE: Ranking Contexts with Hypothetical Query Embeddings

Learning to rank relational objects and its application to web search

RLPS: A Reinforcement Learning–Based Framework for Personalized Search

Investigating Weak Supervision in Deep Ranking

A Deep Reinforcement Learning Approach for Interactive Search with Sentence-level Feedback

AliExpress Learning-To-Rank: Maximizing Online Model Performance without Going Online

Pre-trained Language Model based Ranking in Baidu Search

Towards Off-Policy Reinforcement Learning for Ranking Policies with Human Feedback

Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application

CTR is not Enough: a Novel Reinforcement Learning based Ranking Approach for Optimizing Session Clicks

Interactive Search Based on Deep Reinforcement Learning

PIER: Permutation-Level Interest-Based End-to-End Re-ranking Framework in E-commerce

Is learning to rank effective for Web search?