Abstract: Full-text search engines are important tools for information retrieval. In a proximity full-text search, a document is relevant if it contains the query terms near each other, especially if the query terms are frequently occurring words. For each word in a text, we use additional indexes to store information about nearby words that lie at a distance of at most MaxDistance from the given word. We have shown that additional indexes with three-component keys can improve the average query execution time by up to 94.7 times for queries that consist of high-frequency words. In this paper, we present a new search algorithm with even greater performance gains. We consider several strategies for selecting multi-component key indexes for a specific query and compare these strategies with the optimal strategy. We also present the results of search experiments, which show that three-component key indexes enable much faster searches than two-component key indexes.

This is a pre-print of the contribution "Veretennikov A.B. (2019) Proximity Full-Text Search by Means of Additional Indexes with Multi-component Keys: In Pursuit of Optimal Performance", published in "Manolopoulos Y., Stupnikov S. (eds) Data Analytics and Management in Data Intensive Domains. DAMDID/RCDL 2018. Communications in Computer and Information Science, vol 1003" by Springer, Cham. This book constitutes the refereed proceedings of the 20th International Conference on Data Analytics and Management in Data Intensive Domains, DAMDID/RCDL 2018, held in Moscow, Russia, in October 2018. The 9 revised full papers, presented together with three invited papers, were carefully reviewed and selected from 54 submissions. The final authenticated version is available online at https://doi.org/10.1007/978-3-030-23584-0_7.
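To make the idea of an additional index with a multi-component key more concrete, the following is a minimal sketch of the simplest case, a two-component key. It is an illustration under stated assumptions and not the algorithm of the paper: the function names `build_pair_index` and `docs_with_pair`, the alphabetical ordering of key components, and the posting layout `(doc_id, position, offset)` are all hypothetical choices made for this example. Only the `max_distance` parameter corresponds to the MaxDistance parameter mentioned in the abstract.

```python
from collections import defaultdict


def build_pair_index(docs, max_distance):
    """Build a toy two-component key index (hypothetical layout).

    Key: a pair of words (a, b) with a <= b alphabetically.
    Posting: (doc_id, position of a, signed offset from a to b).
    Only pairs whose positions differ by at most max_distance are stored.
    """
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        words = text.lower().split()
        for i in range(len(words)):
            # Look only forward; each unordered pair of positions is seen once.
            for j in range(i + 1, min(len(words), i + max_distance + 1)):
                a, b = words[i], words[j]
                if a <= b:
                    index[(a, b)].append((doc_id, i, j - i))
                else:
                    index[(b, a)].append((doc_id, j, i - j))
    return index


def docs_with_pair(index, w, v):
    """Return sorted ids of documents where w and v occur near each other."""
    key = (w, v) if w <= v else (v, w)
    return sorted({doc_id for doc_id, _, _ in index.get(key, [])})


docs = [
    "to be or not to be",
    "who are you and who is he",
    "be quick",
]
index = build_pair_index(docs, max_distance=2)
print(docs_with_pair(index, "not", "be"))
```

In this sketch, a proximity query for two words needs a single key lookup instead of an intersection of two long posting lists, which is where the benefit for frequently occurring words would come from. A three-component key would extend the same scheme to triples of nearby words, at the cost of a larger index.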
