Multi Page Search with Reinforcement Learning to Rank

Wei Zeng, Jun Xu, Yanyan Lan, Jiafeng Guo, Xueqi Cheng
DOI: https://doi.org/10.1145/3234944.3234977
2018-01-01
Abstract: Web search engines are typically designed to present multiple pages of search results, and users engaged in exploratory search with ad hoc queries are likely to access more than one result page. The ranking of web pages for such queries should consider information beyond the original query, e.g., the user's clicks on previous result pages. Existing methods that utilize this kind of information usually rely on relevance feedback, which uses the feedback information to explore the user's intent. However, due to the limitations of the feedback mechanism, it is difficult to apply existing relevance feedback techniques to state-of-the-art learning to rank models. In this paper, we propose a novel learning to rank model for multi page search in which the user's feedback can be naturally utilized to improve the ranking of the next result page. The model, referred to as MDP-MPS, formalizes the ranking of documents in multi page search as a Markov decision process (MDP) in which the search engine corresponds to the agent that constructs the document rankings in the result pages, and the user corresponds to the environment that judges the rankings and provides rewards. The REINFORCE policy gradient algorithm is adopted for learning the model parameters. Experimental results on the OHSUMED dataset showed that our approach outperformed two baselines: the traditional relevance ranking model ListNet and the relevance feedback method Rocchio.
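To make the MDP framing concrete, the following is a minimal sketch of REINFORCE applied to sequential document ranking. It is not the MDP-MPS model from the paper. The linear scorer, the DCG-style reward standing in for the user's judgment, the undiscounted return, the toy documents, and all function names are assumptions made for illustration, and the page-by-page click feedback that MDP-MPS carries in its state is omitted.

```python
import math
import random

# Hypothetical toy data: (feature vector, relevance label) pairs.
TOY_DOCS = [([1.0, 0.0], 2), ([0.0, 1.0], 0), ([0.5, 0.5], 1)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def score(w, x):
    """Linear scoring function (an assumption, not the paper's model)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def sample_episode(w, docs, rng):
    """Sample one ranking from the policy, choosing one document per step.

    Returns a list of (grad_log_prob, reward) tuples, one per position.
    """
    remaining = list(docs)
    steps = []
    for position in range(len(docs)):
        probs = softmax([score(w, x) for x, _ in remaining])
        # Expected feature vector under the current action distribution.
        expected = [sum(p * x[k] for p, (x, _) in zip(probs, remaining))
                    for k in range(len(w))]
        idx = rng.choices(range(len(remaining)), weights=probs)[0]
        x, rel = remaining.pop(idx)
        # For a softmax over linear scores: grad log pi(a) = x_a - E[x].
        grad = [xk - ek for xk, ek in zip(x, expected)]
        # A DCG-style gain stands in for the reward given by the user.
        reward = (2 ** rel - 1) / math.log2(position + 2)
        steps.append((grad, reward))
    return steps


def reinforce_update(w, steps, lr):
    """Apply w += lr * G_t * grad log pi(a_t | s_t) for every step t."""
    new_w = list(w)
    ret = 0.0
    for grad, reward in reversed(steps):
        ret += reward  # undiscounted return from step t onward
        for k, g in enumerate(grad):
            new_w[k] += lr * ret * g
    return new_w


def train(docs, episodes=2000, lr=0.05, seed=0):
    """Run REINFORCE for a fixed number of sampled rankings."""
    rng = random.Random(seed)
    w = [0.0] * len(docs[0][0])
    for _ in range(episodes):
        w = reinforce_update(w, sample_episode(w, docs, rng), lr)
    return w


def greedy_ranking(w, docs):
    """Rank documents by descending learned score."""
    return sorted(docs, key=lambda d: score(w, d[0]), reverse=True)
```

Calling `train(TOY_DOCS)` and then `greedy_ranking(w, TOY_DOCS)` shows how the learned weights order the toy documents. In this sketch the agent is the sampling policy and the environment is the reward formula. A multi page setting would additionally need to update the state with the user's feedback after each result page.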