How Well Do Offline and Online Evaluation Metrics Measure User Satisfaction in Web Image Search?

Fan Zhang,Ke Zhou,Yunqiu Shao,Cheng Luo,Min Zhang,Shaoping Ma
DOI: https://doi.org/10.1145/3209978.3210059
2018-01-01
Abstract:Comparing to general Web search engines, image search engines present search results differently, with two-dimensional visual image panel for users to scroll and browse quickly. These differences in result presentation can significantly impact the way that users interact with search engines, and therefore affect existing methods of search evaluation. Although different evaluation metrics have been thoroughly studied in the general Web search environment, how those offline and online metrics reflect user satisfaction in the context of image search is an open question. To shed light on this, we conduct a laboratory user study that collects both explicit user satisfaction feedbacks as well as user behavior signals such as clicks. Based on the combination of both externally assessed topical relevance and image quality judgments, offline image search metrics can be better correlated with user satisfaction than merely using topical relevance. We also demonstrate that existing offline Web search metrics can be adapted to evaluate on a two-dimensional presentation for image search. With respect to online metrics, we find that those based on image click information significantly outperform offline metrics. To our knowledge, our work is the first to thoroughly establish the relationship between different measures and user satisfaction in image search.
What problem does this paper attempt to address?