Adjacency Matrix Based Full-Text Indexing Models

Shuigeng Zhou, Jihong Guan, Yunfa Hu, Jiangtao Hu, Aoying Zhou
DOI: https://doi.org/10.1007/3-540-47714-4_6
2002-01-01
Journal of Software
Abstract: This paper proposes two new character-based full-text indexing models: the adjacency matrix based inverted file and the adjacency matrix based PAT array. Formally, the former is a reorganization of the traditional inverted file, and the latter is a decomposition of the traditional PAT array. Both organize text-indexing information in the form of an adjacency matrix. Query algorithms for the new models are developed, and their performance is compared with that of the traditional models. The new models improve query-processing efficiency considerably, at the cost of extra storage overhead that is much smaller than the size of the original text database, so they are suitable for large-scale text databases, especially Chinese text databases.
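The abstract does not spell out the index layout, so the following is only a minimal sketch of one plausible reading of "adjacency matrix based inverted file": each matrix cell is keyed by an ordered pair of adjacent characters and holds the positions where that pair occurs in the text. The function names `build_pair_index` and `find`, the dictionary-based sparse matrix, and the set-intersection query are all assumptions made for illustration, not the authors' published algorithms.

```python
from collections import defaultdict


def build_pair_index(text):
    """Build a sparse 'adjacency matrix': cell (a, b) lists every
    position i where text[i] == a and text[i + 1] == b.
    (Hypothetical layout, assumed for illustration.)"""
    index = defaultdict(list)
    for i in range(len(text) - 1):
        index[(text[i], text[i + 1])].append(i)
    return index


def find(index, text, pattern):
    """Return the sorted start positions of `pattern` in `text`.

    Patterns of length >= 2 are answered from the pair index alone:
    the pattern occurs at position s exactly when every overlapping
    pair (pattern[k], pattern[k + 1]) occurs at position s + k.
    Shorter patterns fall back to a plain scan of the text."""
    if len(pattern) < 2:
        return [i for i, ch in enumerate(text) if ch == pattern]
    # Start from the occurrences of the first character pair.
    candidates = set(index.get((pattern[0], pattern[1]), []))
    for k in range(1, len(pattern) - 1):
        positions = index.get((pattern[k], pattern[k + 1]), [])
        # Shift each occurrence back by k to align it with the pattern start.
        candidates &= {p - k for p in positions}
        if not candidates:
            break
    return sorted(candidates)


if __name__ == "__main__":
    sample = "中文文本数据库文本"
    idx = build_pair_index(sample)
    print(find(idx, sample, "文本"))
```

Keying the index on character pairs rather than single characters keeps each position list short, which is one reason such a layout could suit Chinese text, where the character set is large and words are often two characters long. Whether the paper's models exploit exactly this property is not stated in the abstract.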