Multi-objective Learning to Rank by Model Distillation

Jie Tang, Huiji Gao, Liwei He, Sanjeev Katariya

DOI: https://doi.org/10.1145/3637528.3671597

2024-07-10

Abstract: In online marketplaces, the objective of search ranking is not only purchase or conversion (the primary objective) but also the purchase outcomes (secondary objectives), e.g., order cancellation (or return), review rating, customer service inquiries, and long-term platform growth. Multi-objective learning to rank has been widely studied as a way to balance primary and secondary objectives. However, traditional approaches in industry face several challenges: expensive parameter tuning that leads to sub-optimal solutions, imbalanced data sparsity, and incompatibility with ad-hoc objectives. In this paper, we propose a distillation-based ranking solution for multi-objective ranking, which optimizes the end-to-end ranking system at Airbnb across multiple ranking models on different objectives, along with various considerations to optimize training and serving efficiency to meet industry standards. We found that it performs much better than traditional approaches: it not only increases the primary objective by a large margin but also meets the secondary-objective constraints and improves model stability. We also demonstrate that the proposed system can be further simplified by model self-distillation. In addition, we ran simulations showing that this approach can also help us efficiently inject ad-hoc, non-differentiable business objectives into the ranking system while enabling us to balance our optimization objectives.

Information Retrieval

What problem does this paper attempt to address?

This paper addresses how search ranking in online marketplaces can balance the primary objective (purchase or conversion) against secondary objectives (order cancellations or returns, review ratings, customer service inquiries, long-term platform growth, and so on). Traditional methods require expensive parameter tuning that can lead to sub-optimal solutions, suffer from imbalanced data sparsity, and are hard to adapt to ad-hoc objectives. To solve these problems, the paper proposes a multi-objective learning-to-rank solution based on model distillation that optimizes the end-to-end ranking system at Airbnb across multiple ranking models trained on different objectives, while accounting for training and serving efficiency to meet industry standards. According to the paper, the method significantly improves the primary objective while also meeting the secondary-objective constraints and improving model stability. The paper also shows that the proposed system can be further simplified through model self-distillation and, via simulations, that it can efficiently inject ad-hoc non-differentiable business objectives into the ranking system while keeping the optimization objectives balanced.
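To make the general idea concrete, here is a minimal sketch of multi-teacher distillation for ranking. It is not the paper's implementation: this summary does not describe the actual loss, teacher models, or weighting scheme, so the function names (`soft_labels`, `distillation_loss`, `train_student`), the linear student, the listwise softmax loss, the fixed weights, and the `rule_bonus` mechanism are all illustrative assumptions.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = np.asarray(x, dtype=float)
    e = np.exp(z - np.max(z))
    return e / e.sum()


def soft_labels(teacher_scores, weights, rule_bonus=None):
    """Blend per-objective teacher scores into one target distribution.

    teacher_scores: shape (n_objectives, n_items); hypothetical outputs of
        one teacher model per objective for the items in a single query.
    weights: shape (n_objectives,); hypothetical trade-off weights.
    rule_bonus: optional shape (n_items,); stands in for an ad-hoc,
        non-differentiable business rule. It only changes the labels, so
        the student never has to differentiate through it.
    """
    blended = np.asarray(weights, dtype=float) @ np.asarray(teacher_scores, dtype=float)
    if rule_bonus is not None:
        blended = blended + np.asarray(rule_bonus, dtype=float)
    return softmax(blended)


def distillation_loss(student_logits, target):
    """Listwise cross-entropy between the target and the student softmax."""
    shifted = student_logits - np.max(student_logits)
    log_p = shifted - np.log(np.exp(shifted).sum())
    return float(-(target * log_p).sum())


def train_student(features, target, lr=0.5, steps=500):
    """Fit a linear student by gradient descent on the distillation loss."""
    w = np.zeros(features.shape[1])
    for _ in range(steps):
        p = softmax(features @ w)
        # Gradient of softmax cross-entropy w.r.t. the logits is (p - target).
        w -= lr * (features.T @ (p - target))
    return w


# Toy query with three items and two objectives (all numbers made up).
features = np.eye(3)
target = soft_labels([[2.0, 1.0, 0.0], [0.0, 1.0, 3.0]], weights=[0.7, 0.3])
student_w = train_student(features, target)
student_ranking = np.argsort(-(features @ student_w))
```

In this sketch the trade-off between objectives lives entirely in how the soft labels are built, so a single student can be served in place of several per-objective models. Whether the paper's system is structured this way cannot be confirmed from the summary alone.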

A Multi-Objective Learning to re-Rank Approach to Optimize Online Marketplaces for Multiple Stakeholders

RD-Suite: A Benchmark for Ranking Distillation

An Empirical Study of Uniform-Architecture Knowledge Distillation in Document Ranking

Learning to Collaborate: Multi-Scenario Ranking Via Multi-Agent Reinforcement Learning

Ranking Policy Learning via Marketplace Expected Value Estimation From Observational Data

Instruction Distillation Makes Large Language Models Efficient Zero-shot Rankers

AliExpress Learning-To-Rank: Maximizing Online Model Performance without Going Online

Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application

Controllable Multi-Objective Re-ranking with Policy Hypernetworks

Learning To Rank Diversely At Airbnb

Learning Fair Ranking Policies via Differentiable Optimization of Ordered Weighted Averages

Improving Multi-Scenario Learning to Rank in E-commerce by Exploiting Task Relationships in the Label Space

Orbit: A Framework for Designing and Evaluating Multi-objective Rankers

Toward Understanding Privileged Features Distillation in Learning-to-Rank

Ranking Distillation: Learning Compact Ranking Models With High Performance for Recommender System

Deep Pareto Reinforcement Learning for Multi-Objective Recommender Systems

Scalable Learning of Non-Decomposable Objectives

PILE: Pairwise Iterative Logits Ensemble for Multi-Teacher Labeled Distillation

An Efficient Combinatorial Optimization Model Using Learning-to-Rank Distillation

Residual Multi-Task Learner for Applied Ranking