SEMI: A Sequential Multi-Modal Information Transfer Network for E-Commerce Micro-Video Recommendations

Chenyi Lei,Yong Liu,Lingzi Zhang,Guoxin Wang,Haihong Tang,Houqiang Li,Chunyan Miao
DOI: https://doi.org/10.1145/3447548.3467189
2021-08-14
Abstract:The micro-video recommendation system becomes an essential part of the e-commerce platform, which helps disseminate micro-videos to potentially interested users. Existing micro-video recommendation methods only focus on users' browsing behaviors on micro-videos, but ignore their purchasing intentions in the e-commerce environment. Thus, they usually achieve unsatisfied e-commerce micro-video recommendation performances. To address this problem, we design a sequential multi-modal information transfer network (SEMI), which utilizes product-domain user behaviors to assist micro-video recommendations. SEMI effectively selects relevant items (i.e., micro-videos and products) with multi-modal features in the micro-video domain and product domain to characterize users' preferences. Moreover, we also propose a cross-domain contrastive learning (CCL) algorithm to pre-train sequence encoders for modeling users' sequential behaviors in these two domains. The objective of CCL is to maximize a lower bound of the mutual information between different domains. We have performed extensive experiments on a large-scale dataset collected from Taobao, a world-leading e-commerce platform. Experimental results show that the proposed method achieves significant improvements over state-of-the-art recommendation methods. Moreover, the proposed method has also been deployed on Taobao, and the online A/B testing results further demonstrate its practical value.
What problem does this paper attempt to address?