Multi-Grained Topological Pre-Training of Language Models in Sponsored Search

Zhoujin Tian, Chaozhuo Li, Zhiqiang Zuo, Zengxuan Wen, Xinyue Hu, Xiao Han, Haizhen Huang, Senzhang Wang, Weiwei Deng, Xing Xie, Qi Zhang
DOI: https://doi.org/10.1145/3539618.3592024
2023-01-01
Abstract: Relevance models, widely recognized as the nucleus of sponsored search systems, measure the semantic closeness between queries and candidate ads. Conventional relevance models rely solely on the textual data within queries and ads, and their performance is hindered by the scarce semantic information in these short texts. Recently, user behavior graphs have been incorporated to provide complementary information beyond pure textual semantics. Despite their promising performance, behavior-enhanced models incur heavy resource costs because of the extra computation introduced by explicit topological aggregation. In this paper, we propose a novel Multi-Grained Topological Pre-Training paradigm, MGTLM, which teaches language models to understand multi-grained topological information in behavior graphs and thereby helps eliminate explicit graph aggregation and avoid information loss. Extensive experimental results in online and offline settings demonstrate the superiority of our proposal.