Cross-Domain Recommendation to Cold-Start Users via Variational Information Bottleneck

Xin Cong, Jiawei Sheng, Jiangxia Cao, Bin Wang, Tingwen Liu
DOI: https://doi.org/10.48550/arXiv.2203.16863
2022-03-31
Abstract: Recommender systems have been widely deployed in many real-world applications, but usually suffer from the long-standing user cold-start problem. As a promising solution, Cross-Domain Recommendation (CDR) has attracted a surge of interest; it aims to transfer the user preferences observed in the source domain to make recommendations in the target domain. Previous CDR approaches mostly follow the Embedding and Mapping (EMCDR) idea, which attempts to learn a mapping function that transfers pre-trained user representations (embeddings) from the source domain into the target domain. However, they pre-train the user/item representations independently for each domain and fail to consider the interactions of both domains simultaneously. As a result, the biased pre-trained representations inevitably contain domain-specific information, which may negatively affect the transfer of information across domains. In this work, we consider a key question of the CDR task: what information needs to be shared across domains? To answer it, this paper utilizes the information bottleneck (IB) principle and proposes a novel approach, termed CDRIB, that forces the representations to encode the domain-shared information. To derive unbiased representations, we devise two IB regularizers that model the cross-domain and in-domain user-item interactions simultaneously, so that CDRIB considers the interactions of both domains jointly for de-biasing. With an additional contrastive information regularizer, CDRIB can also capture cross-domain user-user correlations. In this way, these regularizers encourage the representations to encode the domain-shared information, which can be used to make recommendations in both domains directly. To the best of our knowledge, this paper is the first work to capture the domain-shared information for cold-start users via the variational information bottleneck.
Empirical experiments illustrate that CDRIB outperforms the state-of-the-art approaches on four real-world cross-domain datasets, demonstrating the effectiveness of adopting the information bottleneck for CDR.
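To make the information bottleneck idea in the abstract more concrete, the following is a minimal sketch of a generic variational IB objective for one user-item pair. It is not the CDRIB implementation: the function names, the standard-normal prior, the dot-product scorer, and the weight `beta` are all assumptions chosen for illustration. The loss combines a prediction term (how well the compressed user representation explains an interaction) with a KL term that penalizes information kept in the representation.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def interaction_nll(user_z, item_z, label):
    """Binary cross-entropy for one pair scored by a dot product (assumed scorer)."""
    score = sum(u * v for u, v in zip(user_z, item_z))
    # -log sigmoid(score) = softplus(-score); -log(1 - sigmoid(score)) = softplus(score)
    return label * softplus(-score) + (1 - label) * softplus(score)


def vib_loss(user_mu, user_log_var, item_z, label, beta, rng):
    """Prediction loss on a sampled latent plus beta times the compression (KL) term."""
    z = sample_latent(user_mu, user_log_var, rng)
    return interaction_nll(z, item_z, label) + beta * kl_to_standard_normal(user_mu, user_log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 2-dimensional user posterior and item embedding.
    loss = vib_loss([0.3, -0.2], [0.0, -0.5], [0.8, 0.1], label=1, beta=0.1, rng=rng)
    print(loss)
```

A larger `beta` pushes the user posterior toward the prior and discards more input-specific detail, while a smaller `beta` favors prediction accuracy. In a cross-domain setting, the prediction term would be computed on interactions from the other domain as well, which is an assumption about how such a regularizer could be applied rather than a description of the paper's exact losses.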
Computer Science