Improving Web Search Ranking by Incorporating Summarization

Xian-Jun Meng, Qing-Cai Chen, Xiao-Long Wang, Xiao-Hong Yang
DOI: https://doi.org/10.1109/icsmc.2007.4414122
2007-01-01
Abstract: Although link-analysis-based page ranking has been highly successful in commercial search engines (SE), content-based relevance computation still plays an important role in ranking retrieval results. Most existing relevance algorithms operate on the full text of a web page; this paper instead computes relevance between the user's query and an automatically generated summary of each page. The paper first gives a brief overview of the state of the art in relevance computation for SE, with particular attention to the inference network approach, which serves as the baseline in the experimental SE system. It then introduces an automatic text summarization method based on multi-source integration, and replaces the full text of each web page with its generated summary when computing relevance to the user query. To evaluate how this condensed representation affects relevance-based ranking, the final part of the paper reports experiments comparing summary-based ranking at several compression ratios against full-text ranking. In addition to an efficiency gain for the SE system, the results show that ranking on summaries generated at a 30% compression ratio yields an 11.29% improvement in precision.
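The pipeline described in the abstract (summarize each page at a given compression ratio, then score the query against the summary instead of the full text) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the summarizer or the scoring formula, so a simple term-frequency sentence selector stands in for the multi-source integration method, and cosine similarity over term counts stands in for the inference network. The function names `summarize`, `relevance`, and `rank` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, ratio=0.3):
    """Keep about `ratio` of the sentences, in their original order.

    Stand-in summarizer (NOT the paper's method): each sentence is scored
    by the average document-level frequency of its terms.
    """
    sentences = split_sentences(text)
    if not sentences:
        return ""
    tf = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        return sum(tf[t] for t in tokens) / len(tokens) if tokens else 0.0

    keep = max(1, math.ceil(ratio * len(sentences)))
    order = sorted(range(len(sentences)),
                   key=lambda i: score(sentences[i]), reverse=True)
    return " ".join(sentences[i] for i in sorted(order[:keep]))


def relevance(query, text):
    """Cosine similarity of term-count vectors (stand-in for the inference network)."""
    q, d = Counter(tokenize(query)), Counter(tokenize(text))
    dot = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0


def rank(query, pages, ratio=0.3):
    """Rank page ids by the relevance of each page's summary to the query."""
    scored = [(relevance(query, summarize(text, ratio)), page_id)
              for page_id, text in pages.items()]
    return [page_id for _, page_id in
            sorted(scored, key=lambda pair: pair[0], reverse=True)]
```

Calling `rank(query, pages, ratio=1.0)` keeps every sentence and so reduces to full-text ranking, which gives a rough analogue of the full-text baseline that the summary-based runs are compared against.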