Context-Aware Query Rewriting for Improving Users' Search Experience on E-commerce Websites

Simiao Zuo, Qingyu Yin, Haoming Jiang, Shaohui Xi, Bing Yin, Chao Zhang, Tuo Zhao
DOI: https://doi.org/10.48550/arXiv.2209.07584
2022-09-24
Abstract: E-commerce queries are often short and ambiguous. Consequently, query understanding often uses query rewriting to disambiguate user-input queries. While using e-commerce search tools, users tend to enter multiple searches, which we call context, before purchasing. These history searches contain contextual insights about users' true shopping intents. Therefore, modeling such contextual information is critical to a better query rewriting model. However, existing query rewriting models ignore users' history behaviors and consider only the instant search query, which is often a short string offering limited information about the true shopping intent. We propose an end-to-end context-aware query rewriting model to bridge this gap, which takes the search context into account. Specifically, our model builds a session graph using the history search queries and their contained words. We then employ a graph attention mechanism that models cross-query relations and computes contextual information of the session. The model subsequently calculates session representations by combining the contextual information with the instant search query using an aggregation network. The session representations are then decoded to generate rewritten queries. Empirically, we demonstrate the superiority of our method to state-of-the-art approaches under various metrics. On in-house data from an online shopping platform, by introducing contextual information, our model achieves 11.6% improvement under the MRR (Mean Reciprocal Rank) metric and 20.1% improvement under the HIT@16 metric (a hit rate metric), in comparison with the best baseline method (Transformer-based model).
Information Retrieval, Machine Learning
What problem does this paper attempt to address?
This paper addresses query rewriting in e-commerce search. User queries are usually short and ambiguous, so search engines struggle to infer the user's actual shopping intent, return irrelevant products, and degrade the user experience. The paper proposes an end-to-end context-aware query rewriting model that uses the user's historical search queries to capture contextual information, improving both the accuracy of rewritten queries and the search experience.

### Main contributions of the paper

1. **Context-aware query rewriting model**: The paper proposes a query rewriting method that considers not only the current query but also the user's historical search queries. It constructs a session graph and applies a graph attention mechanism to capture cross-query relations, which lets it infer the user's shopping intent more accurately.
2. **Improved performance**: Compared with existing state-of-the-art methods, the model improves on multiple evaluation metrics. Relative to the best baseline (a Transformer-based model), it improves MRR (Mean Reciprocal Rank) by 11.6% and HIT@16 by 20.1%.
3. **End-to-end learning framework**: The model uses an end-to-end sequence-to-sequence (seq2seq) learning framework, which avoids the complex feature engineering of traditional methods and the need for large amounts of manually labeled data, making the model more efficient and easier to train.

### Specific technical details of the model

1. **Transformer encoder**: The model first uses a Transformer encoder to encode the current query and the historical queries, producing a representation vector for each query.
2. **Session graph construction**: A session graph is built from the user's search history; its nodes are the queries and the words they contain. Through the graph attention mechanism, the model captures the relations between different queries and so better understands the user's intent.
3. **Aggregation network**: The aggregation network combines the information from the historical queries with that of the current query to produce the final session representation. This representation carries the user's contextual information and helps generate more accurate rewritten queries.
4. **Transformer decoder**: Finally, a Transformer decoder generates the rewritten query. The decoder produces multiple candidate queries from the session representation and selects the most appropriate one as the final output.

### Experimental verification

The paper evaluates the model on an in-house dataset from an online shopping platform. The model outperforms the existing baseline methods on multiple evaluation metrics, most notably MRR and HIT@16.

### Conclusion

The context-aware query rewriting model proposed in this paper addresses query ambiguity in e-commerce search. By using the user's historical search queries, it improves the accuracy of rewritten queries and the search experience. The model's end-to-end learning framework and graph attention mechanism offer new ideas for future research.
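
The session graph construction step described above can be pictured with a small sketch. This is not the authors' implementation: it assumes whitespace tokenization, lowercasing, and unweighted undirected edges between a query node and each word it contains, none of which are specified in this summary. The function names `build_session_graph` and `queries_sharing_words` are hypothetical.

```python
from collections import defaultdict


def build_session_graph(history_queries):
    """Build an undirected graph linking each query node to its word nodes.

    Assumption: nodes are ("query", index) and ("word", token) tuples, and
    tokens come from a plain lowercase whitespace split.
    """
    adjacency = defaultdict(set)
    for idx, query in enumerate(history_queries):
        query_node = ("query", idx)
        adjacency[query_node]  # register the node even if the query is empty
        for word in query.lower().split():
            word_node = ("word", word)
            adjacency[query_node].add(word_node)
            adjacency[word_node].add(query_node)
    return dict(adjacency)


def queries_sharing_words(graph, query_idx):
    """Return indices of other queries reachable through a shared word node."""
    related = set()
    for word_node in graph[("query", query_idx)]:
        for kind, idx in graph[word_node]:
            if kind == "query" and idx != query_idx:
                related.add(idx)
    return sorted(related)


session = ["running shoes", "nike shoes", "water bottle"]
graph = build_session_graph(session)
related_to_first = queries_sharing_words(graph, 0)
```

In this toy session the first two queries are connected through the shared word node `shoes`, while the third query is isolated. A graph attention layer would then pass information along such edges, which is how cross-query relations could enter the session representation.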
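
For readers unfamiliar with the two reported metrics, the following sketch shows how MRR and HIT@K are conventionally computed. It assumes each test example has a single target item and a ranked candidate list. The paper's exact evaluation protocol is not described in this summary, so the function names and the toy data below are illustrative only.

```python
def reciprocal_rank(ranked_candidates, target):
    """Return 1/rank of the target in the ranked list, or 0.0 if absent."""
    for position, candidate in enumerate(ranked_candidates, start=1):
        if candidate == target:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(all_rankings, targets):
    """Average the reciprocal rank over all examples."""
    scores = [reciprocal_rank(r, t) for r, t in zip(all_rankings, targets)]
    return sum(scores) / len(scores)


def hit_at_k(all_rankings, targets, k):
    """Fraction of examples whose target appears in the top-k candidates."""
    hits = [t in r[:k] for r, t in zip(all_rankings, targets)]
    return sum(hits) / len(hits)


# Toy data (made up for illustration): three examples, three candidates each.
rankings = [["a", "b", "c"], ["x", "y", "z"], ["p", "q", "r"]]
targets = ["a", "y", "missing"]

mrr = mean_reciprocal_rank(rankings, targets)
hit_at_1 = hit_at_k(rankings, targets, k=1)
hit_at_2 = hit_at_k(rankings, targets, k=2)
```

HIT@16 as reported in the paper corresponds to `k=16` under this convention: an example counts as a hit if the target appears anywhere in the top 16 candidates.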