RankFlow

Jiarui Qin,Jiachen Zhu,Bo Chen,Zhirong Liu,Weiwen Liu,Ruiming Tang,Rui Zhang,Yong Yu,Weinan Zhang
DOI: https://doi.org/10.1145/3477495.3532050
2022-01-01
Abstract:Building a multi-stage cascade ranking system is a commonly used solution to balance the efficiency and effectiveness in modern information retrieval (IR) applications, such as recommendation and web search. Despite the popularity in practice, the literature specific on multi-stage cascade ranking systems is relatively scarce. The common practice is to train rankers of each stage independently using the same user feedback data (a.k.a., impression data), disregarding the data flow and the possible interactions between stages. This straightforward solution could lead to a sub-optimal system because of the sample selection bias (SSB) issue, which is especially damaging for cascade rankers due to the negative effect accumulated in the multiple stages. Worse still, the interactions between the rankers of each stage are not fully exploited. This paper provides an elaborate analysis of this commonly used solution to reveal its limitations. By studying the essence of cascade ranking, we propose a joint training framework named RankFlow to alleviate the SSB issue and exploit the interactions between the cascade rankers, which is the first systematic solution for this topic. We propose a paradigm of training cascade rankers that emphasizes the importance of fitting rankers on stage-specific data distributions instead of the unified user feedback distribution. We design the RankFlow framework based on this paradigm: The training data of each stage is generated by its preceding stages while the guidance signals not only come from the logs but its successors. Extensive experiments are conducted on various IR scenarios, including recommendation, web search and advertisement. The results verify the efficacy and superiority of RankFlow.
What problem does this paper attempt to address?