Clustering and visualization of web search results

Zhao Hua-Jun, Zhong Cai-Ming, Li Wen, Wang Rui-Zhi, Miao Duo-Qian
2010-01-01
Abstract: Search engines are now the most common tools for information retrieval on the internet. However, they have several limitations, such as low search coverage and the dynamic nature of web pages, which is why users' search experience has seen no breakthrough in recent years. The leading search engines return a long list of records sorted by relevance to the query; synonymy and polysemy make it difficult for users to express their intent, and users spend much time selecting the web pages they are interested in. This paper aims to enhance the search experience using data analysis technologies. By clustering and visualizing web search results, and then grouping the clustering results according to certain criteria, it helps users quickly locate the information they are interested in. Data structures related to the suffix tree are widely used in string processing and text compression. A clustering algorithm based on the suffix tree, which makes it easy to recognize the phrases shared among web pages, can be used to cluster web pages. It improves clustering efficiency because it does not need to calculate the similarities between pairs of documents, assigns meaningful labels to the clusters to enhance readability, and improves end users' search experience through visualization. A prototype clustering engine, named Effective Clustering Engine, has been built on this approach. Experiments verify that the algorithm is quite efficient and that the clustering results are readable and accurate.
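The idea of grouping pages by shared phrases, without computing pairwise document similarities, can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a naive word-level suffix trie with a capped phrase length instead of a linear-time suffix tree, it skips the merging of overlapping clusters and the visualization step, and all function names (`build_suffix_trie`, `base_clusters`, `label_clusters`) and parameters (`max_len`, `min_docs`) are hypothetical.

```python
def build_suffix_trie(snippets, max_len=4):
    """Insert every word-level suffix of every snippet (truncated to
    max_len words) into a trie. Each node stores the ids of the
    snippets whose text contains the phrase spelled by that node."""
    root = {"children": {}, "docs": set()}
    for doc_id, text in enumerate(snippets):
        words = text.lower().split()
        for start in range(len(words)):
            node = root
            for word in words[start:start + max_len]:
                node = node["children"].setdefault(
                    word, {"children": {}, "docs": set()}
                )
                node["docs"].add(doc_id)
    return root


def base_clusters(root, min_docs=2):
    """Return (phrase, doc ids) for every phrase shared by at least
    min_docs snippets. No pairwise similarity is computed: snippets
    are grouped simply because they pass through the same trie node."""
    found = []
    stack = [((), root)]
    while stack:
        phrase, node = stack.pop()
        if phrase and len(node["docs"]) >= min_docs:
            found.append((" ".join(phrase), frozenset(node["docs"])))
        for word, child in node["children"].items():
            stack.append((phrase + (word,), child))
    return found


def label_clusters(clusters):
    """For each distinct set of snippets, keep the longest shared
    phrase (by word count) as a human-readable cluster label."""
    best = {}
    for phrase, docs in clusters:
        if docs not in best or len(phrase.split()) > len(best[docs].split()):
            best[docs] = phrase
    return {label: sorted(docs) for docs, label in best.items()}
```

As a usage example, calling `label_clusters(base_clusters(build_suffix_trie(snippets)))` on the four snippets "jaguar car dealer prices", "used jaguar car reviews", "jaguar cat habitat", and "big cat habitat facts" yields three labeled groups: "jaguar" (snippets 0, 1, 2), "jaguar car" (snippets 0, 1), and "cat habitat" (snippets 2, 3). The labels separate the two senses of the polysemous query term, which is the kind of readable grouping the abstract describes.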