Weakly Supervised Co-Training of Query Rewriting and Semantic Matching for E-Commerce

Rong Xiao, Jianhui Ji, Baoliang Cui, Haihong Tang, Wenwu Ou, Yanghua Xiao, Jiwei Tan, Xuan Ju
DOI: https://doi.org/10.1145/3289600.3291039
2019-01-01
Abstract: Relevance is the core problem of a search engine, and one of the main challenges is the vocabulary gap between user queries and documents. This problem is more serious in e-commerce, because the language in product titles is more professional. Query rewriting and semantic matching are two key techniques for bridging this semantic gap and improving relevance. Recently, deep neural networks have been successfully applied to the two tasks and have enhanced relevance performance. However, such approaches suffer from the sparseness of training data in e-commerce scenarios. In this study, we investigate the intrinsic connection between the query rewriting and semantic matching tasks, and propose a co-training framework to address the data sparseness problem when training deep neural networks. We first build a huge unlabeled dataset from search logs, on which the two tasks can be considered as two different views of the relevance problem. Then we iteratively co-train them via labeled data generated from this unlabeled set to boost their performance simultaneously. We conduct a series of offline and online experiments on a real-world e-commerce search engine, and the results demonstrate that the proposed method improves relevance significantly.
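The iterative co-training idea described in the abstract can be illustrated with a minimal, generic sketch. This is not the authors' implementation: the two deep models (query rewriting and semantic matching) are replaced here by hypothetical one-feature nearest-centroid classifiers, and the names `CentroidView` and `co_train`, the toy data, and the confidence rule are all assumptions made for illustration. Each view pseudo-labels the unlabeled examples it is most confident about and hands them to the other view as training data.

```python
from dataclasses import dataclass, field


@dataclass
class CentroidView:
    """Toy stand-in for one model ("view"): nearest class centroid on one feature."""
    index: int                                 # which feature of an example this view sees
    data: list = field(default_factory=list)   # (feature_value, label) training pairs

    def fit(self):
        pos = [x for x, y in self.data if y == 1]
        neg = [x for x, y in self.data if y == 0]
        self.pos_c = sum(pos) / len(pos)
        self.neg_c = sum(neg) / len(neg)

    def predict(self, example):
        x = example[self.index]
        d_pos, d_neg = abs(x - self.pos_c), abs(x - self.neg_c)
        label = 1 if d_pos < d_neg else 0
        confidence = abs(d_pos - d_neg)        # larger gap = more confident
        return label, confidence


def co_train(seed, unlabeled, rounds=3, per_round=2):
    """Generic two-view co-training loop (illustrative assumption, not the paper's method)."""
    view_a, view_b = CentroidView(0), CentroidView(1)
    for example, label in seed:
        view_a.data.append((example[0], label))
        view_b.data.append((example[1], label))
    pool = list(unlabeled)
    for _ in range(rounds):
        view_a.fit()
        view_b.fit()
        # Each view in turn acts as teacher and pseudo-labels data for the other view.
        for teacher, student in ((view_a, view_b), (view_b, view_a)):
            if not pool:
                break
            ranked = sorted(pool, key=lambda ex: teacher.predict(ex)[1], reverse=True)
            for example in ranked[:per_round]:
                label, _ = teacher.predict(example)
                student.data.append((example[student.index], label))
                pool.remove(example)
    view_a.fit()
    view_b.fit()
    return view_a, view_b


# Hypothetical toy data: each example is (feature seen by view A, feature seen by view B).
seed = [((0.9, 0.8), 1), ((0.1, 0.2), 0)]
unlabeled = [(0.95, 0.6), (0.05, 0.4), (0.8, 0.9), (0.2, 0.1), (0.7, 0.55), (0.3, 0.45)]
view_a, view_b = co_train(seed, unlabeled)
```

In this sketch each view starts from the same two labeled seeds and grows its training set only through the other view's confident predictions, which is the sense in which two views of one relevance problem can compensate for sparse labels.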