Optimization for Collecting and Scoring Documents for Complex Boolean Query

Da HUANG, Hongfei YAN
DOI: https://doi.org/10.3778/j.issn.1673-9418.1511044
2017-01-01
Abstract: Although the Boolean query was proposed very early in information retrieval, most research on it focuses on homogeneous Boolean operations. Few researchers have paid attention to complex Boolean queries, even though such queries are used more and more frequently, e.g. in text-based recommendation. To make complex Boolean queries execute more efficiently, this paper proposes a new strategy, the DCQ (DAAT for complex query) algorithm, which is based on the DAAT (document-at-a-time) framework. Compared with the well-known open-source search engine Lucene, the DCQ algorithm shows a promising improvement in performance. In addition, this paper proposes a performance regression method that can accurately decide when to use the DCQ algorithm. Experiments show that the compound algorithm with performance regression performs much better than the algorithm for collecting and scoring documents used in Lucene.
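For readers unfamiliar with the DAAT framework named in the abstract, the following is a minimal sketch of generic document-at-a-time collection for a nested (complex) Boolean query. It is not the DCQ algorithm, whose details the abstract does not give, and it is not Lucene's implementation. The class names (`Term`, `And`, `Or`, `AndNot`), the `advance` method, the `collect` driver, and the toy posting lists are all assumptions made for illustration, and scoring is omitted.

```python
# Illustrative sketch only: generic DAAT collection over sorted posting lists.
# All names here are hypothetical; this is neither DCQ nor Lucene code.

END = float("inf")  # sentinel doc ID meaning "no more documents"


class Term:
    """Leaf node: walks one term's sorted posting list."""

    def __init__(self, postings):
        self.postings = sorted(postings)
        self.pos = 0

    def advance(self, target):
        # Move to the first doc ID >= target and return it (END if none).
        while self.pos < len(self.postings) and self.postings[self.pos] < target:
            self.pos += 1
        return self.postings[self.pos] if self.pos < len(self.postings) else END


class And:
    """Conjunction: a doc matches only if every child matches it."""

    def __init__(self, *children):
        self.children = children

    def advance(self, target):
        while True:
            docs = [child.advance(target) for child in self.children]
            highest = max(docs)
            # Either some child is exhausted, or all children agree on one doc.
            if highest == END or min(docs) == highest:
                return highest
            target = highest  # leapfrog the lagging children forward


class Or:
    """Disjunction: a doc matches if any child matches it."""

    def __init__(self, *children):
        self.children = children

    def advance(self, target):
        return min(child.advance(target) for child in self.children)


class AndNot:
    """Matches docs of `include` that do not appear in `exclude`."""

    def __init__(self, include, exclude):
        self.include = include
        self.exclude = exclude

    def advance(self, target):
        doc = self.include.advance(target)
        while doc != END and self.exclude.advance(doc) == doc:
            doc = self.include.advance(doc + 1)
        return doc


def collect(query):
    """Visit matching documents one at a time, in increasing doc ID order."""
    hits = []
    doc = query.advance(0)
    while doc != END:
        hits.append(doc)
        doc = query.advance(doc + 1)
    return hits


# Toy query: (apple AND (banana OR cherry)) AND NOT durian
query = AndNot(
    And(Term([1, 2, 4, 7, 9]), Or(Term([2, 3, 7, 8]), Term([4, 7, 10]))),
    Term([7]),
)
print(collect(query))
```

Each node only ever moves forward through its posting lists, so every candidate document is fully evaluated against the whole query tree before the next one is considered, which is what "document-at-a-time" refers to.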