Identifying queries in the wild, wild web.

Jingjing Liu, Chang Liu, Jun Zhang, Ralf Bierig, Michael J. Cole
DOI: https://doi.org/10.1145/1840784.1840832
2010-01-01
Abstract: Identifying user querying behavior is an important problem for information seeking and retrieval research. Query-related studies typically rely on server-side logs taken from a single search engine, but a comprehensive view of user querying behaviors requires analysis of data collected on the client side for unrestricted searches. We developed three methods to identify querying behaviors and tested them on client-side logs collected in a lab experiment for realistic tasks and unrestricted searches on the entire Web. Results show that the best method was able to identify 97% of queries issued, with a precision of 92%. Although based on a relatively small number of search episodes, our methods, perhaps with minimal modifications, should be adequate for identification of queries in logs of unconstrained Web search.