AdsCVLR: Commercial Visual-Linguistic Representation Modeling in Sponsored Search
Yongjie Zhu, Chunhui Han, Yuefeng Zhan, Bochen Pang, Zhaoju Li, Hao Sun, Si Li, Boxin Shi, Nan Duan, Weiwei Deng, Ruofei Zhang, Liangjie Zhang, Qi Zhang
DOI: https://doi.org/10.1145/3503161.3548226
2022-01-01
Abstract: Sponsored search advertisements (ads) appear next to search results when consumers look for products and services on search engines. As the foundation of search ads, relevance modeling has attracted increasing attention because of its significant research challenges and practical value. In this paper, we address multi-modal modeling in sponsored search, that is, modeling the relevance between a user query and commercial ads that carry multi-modal structured information. To solve this problem, we propose AdsCVLR, a transformer architecture for Commercial Visual-Linguistic Representation on ads data, trained with contrastive learning. It extends the transformer encoder with complementary multi-modal inputs and serves as a strong aggregator of image-text features. We also release a public advertising dataset of 480K labeled query-ad pairs with structured information such as image, title, seller, and description. We evaluate AdsCVLR on a large industry dataset, and results from both online and offline tests show the superiority of our method.
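The abstract names contrastive learning but does not state the exact objective. As a rough illustration only, the sketch below assumes a standard in-batch InfoNCE-style loss over query and ad embeddings. The function names, the temperature value, and the toy vectors are hypothetical and are not taken from the paper.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (assumes the vector is non-zero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def info_nce_loss(query_vecs, ad_vecs, temperature=0.07):
    """In-batch contrastive loss: query i should score highest with ad i.

    Assumption: this is a generic InfoNCE formulation, not necessarily
    the objective used by AdsCVLR.
    """
    q = [l2_normalize(v) for v in query_vecs]
    a = [l2_normalize(v) for v in ad_vecs]
    total = 0.0
    for i, qi in enumerate(q):
        # Cosine similarity of query i against every ad in the batch.
        logits = [sum(x * y for x, y in zip(qi, aj)) / temperature for aj in a]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching ad (index i) as the target.
        total += log_denom - logits[i]
    return total / len(q)

# Toy embeddings (hypothetical): two queries and their ads.
queries = [[1.0, 0.0], [0.0, 1.0]]
ads_matched = [[0.9, 0.1], [0.1, 0.9]]   # ad i is close to query i
ads_swapped = [[0.1, 0.9], [0.9, 0.1]]   # ad i is close to the other query

loss_matched = info_nce_loss(queries, ads_matched)
loss_swapped = info_nce_loss(queries, ads_swapped)
print(loss_matched < loss_swapped)
```

In this toy batch the loss is lower when each query is paired with its own ad than when the pairs are swapped, which is the behavior a contrastive relevance objective is meant to encourage.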