A two-stage information retrieval system based on interactive multimodal genetic algorithm for query weight optimization
Hao Cong, Wei-Neng Chen, Wei-Jie Yu
DOI: https://doi.org/10.1007/s40747-021-00450-6
2021-07-14
Abstract: Query weight optimization, which aims to find an optimal combination of query term weights for sorting relevant documents, is an important topic in information retrieval systems. Because the search space is huge, the query optimization problem is intractable, and evolutionary algorithms have become a popular approach. However, as the size of the database grows, traditional retrieval approaches may return a large number of results, which leads to low efficiency and poor practicality. To solve this problem, this paper proposes a two-stage information retrieval system based on an interactive multimodal genetic algorithm (IMGA) for query weight optimization. The proposed IMGA has two stages: quantity control and quality optimization. In the quantity control stage, a multimodal genetic algorithm aided by a niching method simultaneously selects multiple promising combinations of query terms, by which the number of retrieved documents is kept within an appropriate range. In the quality optimization stage, an interactive genetic algorithm is designed to find the optimal query weights so that the most user-friendly document retrieval sequence is produced. Users’ feedback accelerates the optimization process, and a genetic algorithm (GA) runs interactively with the relevance feedback mechanism. In place of user evaluation, a mathematical model is built to evaluate the fitness values of individuals. With the proposed two-stage method, not only can the number of returned results be controlled, but the quality and accuracy of retrieval can also be improved. The proposed method is run on a database with more than 2000 documents. The experimental results show that the proposed method outperforms several state-of-the-art query weight optimization approaches in terms of precision and recall.
computer science, artificial intelligence
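
As a rough illustration of what the abstract means by weighting query terms and judging the result by precision and recall, the following is a minimal sketch and not the authors' IMGA or their fitness model. It assumes a simple weighted term-frequency score; the function names (`score`, `rank`, `precision_recall`), the toy documents, and the weights are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def score(doc_terms: Dict[str, int], weights: Dict[str, float]) -> float:
    """Weighted sum of term frequencies (an assumed scoring rule)."""
    return sum(w * doc_terms.get(term, 0) for term, w in weights.items())


def rank(docs: Dict[str, Dict[str, int]], weights: Dict[str, float]) -> List[str]:
    """Return ids of documents with a positive score, best first."""
    scored = [(doc_id, score(terms, weights)) for doc_id, terms in docs.items()]
    hits = [(doc_id, s) for doc_id, s in scored if s > 0]
    # Sort by descending score; break ties by id so the order is deterministic.
    hits.sort(key=lambda pair: (-pair[1], pair[0]))
    return [doc_id for doc_id, _ in hits]


def precision_recall(ranked: List[str], relevant: Set[str], k: int) -> Tuple[float, float]:
    """Precision and recall over the top-k retrieved documents."""
    top = ranked[:k]
    found = sum(1 for doc_id in top if doc_id in relevant)
    precision = found / len(top) if top else 0.0
    recall = found / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical toy collection: document id -> term frequencies.
    docs = {
        "d1": {"genetic": 3, "retrieval": 1},
        "d2": {"retrieval": 2},
        "d3": {"cooking": 5},
    }
    # One candidate weight vector, i.e. one "individual" a GA might evaluate.
    weights = {"genetic": 0.8, "retrieval": 0.2}
    ranked = rank(docs, weights)
    print(ranked)
    print(precision_recall(ranked, {"d1"}, k=2))
```

In a genetic algorithm, each individual would encode one such weight vector, and a quantity like the precision or recall of the resulting ranking could serve as part of its fitness; how the paper actually combines these is not stated in the abstract.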