Technology Report of HIT-IRLab for Evaluation 2005 of 863 Information Retrieval
Zhi-chang ZHANG, Yu ZHANG, Li-qi GAO, Xin-cheng YUAN, Xiao-guang HU, Ting LIU, Sheng LI
DOI: https://doi.org/10.3969/j.issn.1003-0077.2006.z1.015
2006-01-01
Abstract: Lucene, which is based on the vector space model, first searches all web pages and returns a rough set of relevant results. This set is then reranked by Lemur, a language-model-based tool, to form a second set of relevant results. The two sets are combined by linear interpolation into a single set, and its top 1000 pages are returned as the final results. When formulating queries from topics, the keywords of each query are selected from the fields of the topic, and their weights are calculated using a modified tf*idf method. In the official evaluation on 50 topics, automatically constructed queries achieved MAP 0.3107, P@10 0.624, and R-Precision 0.3672, while manually constructed queries achieved MAP 0.3538, P@10 0.684, and R-Precision 0.4078.
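
The combination step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the interpolation weight or how scores from the two systems are made comparable, so the weight `lam`, the min-max normalization, and the function names `min_max_normalize` and `interpolate` are all assumptions chosen for illustration. The inputs are plain dictionaries mapping a page identifier to a score, standing in for the first-pass (vector space) and second-pass (language model) result sets.

```python
def min_max_normalize(scores):
    """Scale scores to [0, 1] so two systems become comparable (assumed step)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally good.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def interpolate(first_pass, second_pass, lam=0.5, top_k=1000):
    """Fuse two result sets as lam * first + (1 - lam) * second.

    lam is a hypothetical weight; the abstract does not report its value.
    A document missing from one set contributes 0 for that set.
    """
    a = min_max_normalize(first_pass)
    b = min_max_normalize(second_pass)
    fused = {
        doc: lam * a.get(doc, 0.0) + (1.0 - lam) * b.get(doc, 0.0)
        for doc in set(a) | set(b)
    }
    # Sort by fused score (descending), breaking ties by document id.
    ranked = sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


if __name__ == "__main__":
    vsm_scores = {"page1": 2.0, "page2": 1.0, "page3": 0.0}
    lm_scores = {"page1": -5.0, "page2": -1.0, "page3": -3.0}
    for doc, score in interpolate(vsm_scores, lm_scores, lam=0.5):
        print(doc, round(score, 4))
```

Normalizing before mixing is one common design choice when the two scores live on different scales (for example, cosine-style similarities versus log-likelihoods); other choices, such as interpolating ranks instead of scores, would fit the abstract's description equally well.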