Pairwise Ranking Loss for Multi-Task Learning in Recommender Systems

Furkan Durmus, Hasan Saribas, Said Aldemir, Junyan Yang, Hakan Cevikalp
2024-06-05
Abstract: Multi-Task Learning (MTL) plays a crucial role in real-world advertising applications such as recommender systems, aiming to achieve robust representations while minimizing resource consumption. MTL endeavors to simultaneously optimize multiple tasks to construct a unified model serving diverse objectives. In online advertising systems, tasks like Click-Through Rate (CTR) and Conversion Rate (CVR) prediction are often treated together as an MTL problem. However, it has been overlooked that a conversion ($y_{cvr}=1$) necessitates a preceding click ($y_{ctr}=1$). In other words, some clicks are associated with a corresponding conversion, while others are not. Moreover, the likelihood of noise is significantly higher for clicks without a conversion than for clicks with one, and existing methods lack the ability to differentiate between these two scenarios. In this study, exposure labels corresponding to conversions are regarded as definitive indicators, and a novel task-specific loss, the pairwise ranking (PWiseR) loss, is computed between model predictions to encourage the model to rely more on them. To demonstrate the effect of the proposed loss function, experiments were conducted on different MTL and Single-Task Learning (STL) models using four distinct public MTL datasets, namely Alibaba FR, NL, US, and CCP, along with a proprietary industrial dataset. The results indicate that the proposed loss function outperforms the BCE loss function in most cases in terms of the AUC metric.
Information Retrieval
What problem does this paper attempt to address?
The paper addresses the following problem: in recommender systems, and especially in online advertising systems, multi-task learning (MTL) models do not distinguish clicked samples with conversions from clicked samples without conversions when predicting click-through rate (CTR) and conversion rate (CVR). Specifically:

1. **Causality problem**: A conversion ($y_{cvr} = 1$) must be preceded by a click ($y_{ctr} = 1$). Existing methods do not model this dependency explicitly, so they treat clicked samples with and without conversions alike, which hurts performance.
2. **Noise problem**: Clicked samples without conversions are more likely to be noisy. For example, such clicks may come from bot traffic, accidental clicks, or click fraud. These samples are therefore less reliable, but existing methods cannot separate them from the more trustworthy samples.
3. **Ranking problem**: In an advertising recommendation system, samples with conversions usually have a higher effective cost per mille (eCPM), so they should be given a higher weight. The traditional binary cross-entropy (BCE) loss does not distinguish these samples, which leads to inaccurate rankings.

To solve these problems, the authors propose a new task-specific pairwise ranking loss (PWiseR loss). By adding a pairwise ranking term, this loss encourages the model to pay more attention to samples with conversions, with the goal of improving prediction accuracy and robustness to noise. Specifically, the PWiseR loss aims to:

- Assign a higher weight to samples with conversions, so that they receive more attention during training.
- Distinguish between clicked samples with and without conversions and reduce the impact of noise on the model.
- Improve the ranking quality of candidate ads and prioritize ads that are more likely to lead to conversions.

In this way, the PWiseR loss can better capture the relationship between the CTR and CVR tasks and thereby improve the overall performance of the multi-task learning model.
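The idea of adding a pairwise ranking term to the usual BCE objective can be illustrated with a minimal sketch. This is not the authors' exact PWiseR formulation, which the text above does not spell out. It assumes a RankNet-style logistic pairwise loss in which impressions with both a click and a conversion are treated as reliable anchors and pushed above non-clicked impressions in the CTR score. The function names and the weight `alpha` are hypothetical and chosen only for illustration.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def bce_loss(logits, labels):
    """Mean binary cross-entropy computed from raw logits."""
    # BCE with logits for one sample: softplus(z) - y * z
    total = sum(softplus(z) - y * z for z, y in zip(logits, labels))
    return total / len(logits)


def pairwise_ranking_loss(ctr_logits, clicks, conversions):
    """Logistic pairwise loss over (converted click, non-click) pairs.

    Assumption (not taken from the paper): anchors are impressions with
    click = 1 and conversion = 1; each anchor should score higher than
    every impression with click = 0.
    """
    anchors = [z for z, c, v in zip(ctr_logits, clicks, conversions)
               if c == 1 and v == 1]
    negatives = [z for z, c in zip(ctr_logits, clicks) if c == 0]
    if not anchors or not negatives:
        return 0.0  # no valid pairs in this batch
    # Penalize each pair where the anchor does not clearly outrank the negative.
    total = sum(softplus(-(a - n)) for a in anchors for n in negatives)
    return total / (len(anchors) * len(negatives))


def ctr_task_loss(ctr_logits, clicks, conversions, alpha=0.5):
    """BCE plus a weighted pairwise term; alpha is a hypothetical weight."""
    return (bce_loss(ctr_logits, clicks)
            + alpha * pairwise_ranking_loss(ctr_logits, clicks, conversions))


# Toy batch: one converted click, one click without conversion, two non-clicks.
logits = [1.5, 0.2, -0.3, -1.0]
clicks = [1, 1, 0, 0]
conversions = [1, 0, 0, 0]
print(ctr_task_loss(logits, clicks, conversions))
```

In this sketch the click without a conversion contributes only through the BCE term, while the converted click also contributes through the ranking term, which is one simple way to make the model lean more on conversion-backed labels.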