Enhanced Query Classification with Millions of Fine-Grained Topics.

Qi Ye, Feng Wang, Bo Li, Zhimin Liu
DOI: https://doi.org/10.1007/978-3-319-39958-4_10
2016-01-01
Abstract: Query classification is a crucial task for understanding users' search intents. Although this problem has been well studied over the past decades, it remains a major challenge in real-world applications because queries are sparse, noisy and ambiguous. In this paper, we present another important issue, which we call "the pomegranate phenomenon". The phenomenon is named for the gap between a small, manually manageable taxonomy and the massive number of coherent topics within each category. Furthermore, the fine-grained topics in one category of the taxonomy may be textually more relevant to topics in other categories than to each other. This phenomenon hurts the performance of most traditional classification methods. To overcome this problem, we present a practical approach to enhance the performance of traditional query classifiers. First, we detect millions of fine-grained query topics, which can represent different query intents, from two years of click logs and assign them category labels. Second, for a given query, we find the K most relevant topics and select a label by majority voting, then use this label to improve the results of classical query classification methods. Empirical evaluation confirms that our topic-based classification algorithms can significantly enhance the performance of traditional classifiers in real-world query classification tasks.
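The second step of the approach (find the K most relevant topics, vote on a label, then use the vote to adjust a classical classifier) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the Jaccard token overlap used as the relevance measure, the toy `TOPICS` list, the function names, and the confidence-threshold rule for combining the vote with a base classifier. The abstract does not specify how relevance is computed or how the voted label is merged with the classical result.

```python
from collections import Counter

def jaccard(a, b):
    """Jaccard similarity between two token sets (assumed relevance measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def topic_vote(query, topics, k=3):
    """Return the majority category among the k topics most similar to the query.

    `topics` is a list of (token_set, category_label) pairs.
    Returns None when no topic shares a token with the query.
    """
    q = set(query.lower().split())
    scored = [(jaccard(q, terms), label) for terms, label in topics]
    scored = [s for s in scored if s[0] > 0]  # drop unrelated topics
    top = sorted(scored, key=lambda s: s[0], reverse=True)[:k]
    if not top:
        return None
    return Counter(label for _, label in top).most_common(1)[0][0]

def enhanced_label(query, topics, base_label, base_confidence, k=3, threshold=0.6):
    """Hypothetical combination rule: trust a confident base classifier,
    otherwise fall back on the topic vote when one is available."""
    if base_confidence >= threshold:
        return base_label
    voted = topic_vote(query, topics, k)
    return voted if voted is not None else base_label

# Toy stand-in for the mined, labeled fine-grained topics.
TOPICS = [
    ({"apple", "iphone", "price"}, "electronics"),
    ({"apple", "macbook", "review"}, "electronics"),
    ({"apple", "pie", "recipe"}, "food"),
    ({"pomegranate", "juice", "recipe"}, "food"),
]
```

With these toy topics, a query such as "apple iphone review" overlaps two electronics topics more strongly than the one food topic, so the vote overrides a low-confidence base prediction of "food". A real system would need an index over millions of topics rather than the linear scan used here.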