User-tag Profile Modeling in Recommendation System Via Contrast Weighted Tag Masking

Chenxu Zhu, Peng Du, Xianghui Zhu, Weinan Zhang, Yong Yu, Yang Cao
DOI: https://doi.org/10.1145/3534678.3539102
2022-01-01
Abstract: User-tag profile modeling has become a significant trend in the development of industrial recommendation systems. In practical scenarios it can be divided into two fundamental tasks: User Preferred Tag (UPT) and Tag Preferred User (TPU). In most existing deep learning models for user-tag profiling, the network takes all the tags of an item combined with the user features as input during training, but takes only one tag with the user features at test time, in order to evaluate the user's preference for a single tag. This leads to a data discrepancy between the training and testing samples. To address this issue, we first try a Random Masking Model (RMM), which masks the tags at training time so that only one tag is retained. However, this introduces two serious drawbacks. First, not all tags attached to the same item are equally predictive; irrelevant tags may introduce noisy signals and thus degrade performance. Second, it neglects the effect of the tags aggregated together, which may be an essential factor leading to user clicks. Therefore, we further propose a framework called Contrast Weighted Tag Masking (CWTM), which tackles these two issues with two modules: (i) the Weighted Masking Module (WMM) introduces an importance network that computes a score for each tag attached to the item and then samples from these tags with probabilities weighted by the scores; (ii) the Contrast Module (CM) uses a contrastive learning architecture to inherit and distill some understanding of the effect of aggregated tags. Offline experiments on four datasets (three public datasets and one proprietary industrial dataset) demonstrate the superiority and effectiveness of CWTM over state-of-the-art baselines. Moreover, CWTM has been deployed on the training platform of Alibaba advertising systems, where it achieved substantial improvements in ROI and CVR of 16.8% and 9.6%, respectively.
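The difference between random masking (as in RMM) and score-weighted masking (as in WMM) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of a softmax to turn scores into sampling probabilities, and the hard-coded scores are all assumptions made for illustration. In the paper the scores come from a learned importance network, which is not modeled here.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities (assumed normalization, not from the paper)."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_mask(tags, rng):
    """RMM-style masking: keep exactly one tag, chosen uniformly at random."""
    return rng.choice(tags)


def weighted_mask(tags, scores, rng):
    """WMM-style masking: keep exactly one tag, sampled by importance score.

    `scores` stands in for the output of an importance network
    (hypothetical placeholder values in this sketch).
    """
    probs = softmax(scores)
    return rng.choices(tags, weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    item_tags = ["sneakers", "red", "free-shipping"]  # hypothetical tags of one item
    tag_scores = [2.0, 0.5, -1.0]                     # hypothetical importance scores
    print(random_mask(item_tags, rng))
    print(weighted_mask(item_tags, tag_scores, rng))
```

Either way, each training sample ends up with a single tag, matching the single-tag input used at test time. Weighted sampling simply makes tags with low scores less likely to be the one retained.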