Part-of-Speech Sense Matrix Model Experiments in the TREC 2004 Robust Track at ICL, PKU.

Bing Swen, Xue-qiang Lü, Hongying Zan, Qi Su, Zhi-guo Lai, Kun Xiang, Jing-he Hu
DOI: https://doi.org/10.6028/nist.sp.500-261.robust-peking.u
2004-01-01
Abstract: The Robust Retrieval track is a traditional ad hoc retrieval task with a focus on individual topic effectiveness. The track gives us an opportunity to experiment with our recently proposed IR model, the Sense Matrix Model (SMM) [Swen 2003, 2004], which uses a word-by-sense matrix document representation. Because this is the first extensive test of the model, some simpler, easy-to-implement forms of SMM are used for this year’s Robust track, in which the parts of speech of words are treated as their (rough) senses. Although the model supports several matrix similarity measures and some advanced data analysis techniques, our initial implementation can only handle sense sets of up to a few hundred senses. We therefore employ a relatively small part-of-speech tag set and use only two matrix similarity measures.
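To make the word-by-sense representation concrete, the following is a minimal sketch, not the authors' implementation. It assumes the input is already part-of-speech tagged, stores each document as a sparse matrix keyed by (word, tag) pairs, and compares two matrices with a cosine based on the Frobenius inner product. The abstract does not name the two similarity measures actually used, so this measure, the function names `sense_matrix` and `frobenius_cosine`, and the tag labels are all illustrative assumptions.

```python
import math
from collections import Counter


def sense_matrix(tagged_tokens):
    """Build a sparse word-by-sense matrix as {(word, tag): count}.

    Assumption: each token arrives as a (word, pos_tag) pair, with the
    part-of-speech tag standing in for the word's (rough) sense.
    """
    return Counter((word.lower(), tag) for word, tag in tagged_tokens)


def frobenius_cosine(a, b):
    """Cosine similarity under the Frobenius inner product.

    Illustrative measure only; the source does not specify its measures.
    """
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy data: "book" is a noun in the document but a verb in the query.
doc = sense_matrix([("book", "NOUN"), ("a", "DET"), ("flight", "NOUN")])
query = sense_matrix([("book", "VERB"), ("flight", "NOUN")])

similarity = frobenius_cosine(doc, query)
print(similarity)
```

In this toy case only the ("flight", NOUN) cell is shared, so "book" as a noun and "book" as a verb do not match, whereas a plain bag-of-words comparison would count both words as overlapping.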