Hybrid Semantic Search: Unveiling User Intent Beyond Keywords

Aman Ahluwalia, Bishwajit Sutradhar, Karishma Ghosh, Indrapal Yadav, Arpan Sheetal, Prashant Patil
2024-09-06
Abstract: This paper addresses the limitations of traditional keyword-based search in understanding user intent and introduces a novel hybrid search approach that leverages the strengths of non-semantic search engines, Large Language Models (LLMs), and embedding models. The proposed system integrates keyword matching, semantic vector embeddings, and LLM-generated structured queries to deliver highly relevant and contextually appropriate search results. By combining these complementary methods, the hybrid approach effectively captures both explicit and implicit user intent. The paper further explores techniques to optimize query execution for faster response times and demonstrates the effectiveness of this hybrid search model in producing comprehensive and accurate search outcomes.
Information Retrieval, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the limitations of traditional keyword-based search engines in understanding user intent. Keyword search can identify documents that contain the query terms, but it often misses what the user actually means, so the results have low relevance. To solve this problem, the paper proposes a hybrid search method that combines three components: keyword matching from a non-semantic search engine, semantic vector embeddings from an embedding model, and structured queries generated by a large language model (LLM). Because these components are complementary, the hybrid model can capture both explicit and implicit user intent. The paper also explores techniques for optimizing query execution to reduce response time, and it reports that the hybrid model produces comprehensive and accurate search results.
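To make the combination of the three components concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say how the signals are merged, so reciprocal rank fusion (RRF) is swapped in as one common choice. The function names, the toy documents, the two-dimensional vectors (standing in for real embedding-model output), and the `filters` dictionary (standing in for an LLM-generated structured query) are all hypothetical.

```python
import math

def keyword_score(query, text):
    """Count how many distinct query terms occur in the text."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank(ids, scores):
    """Map each document id to its 1-based rank by descending score."""
    ordered = sorted(ids, key=lambda i: scores[i], reverse=True)
    return {doc_id: pos for pos, doc_id in enumerate(ordered, start=1)}

def hybrid_search(query, query_vec, filters, docs, k=60):
    """Filter by structured fields, then fuse keyword and semantic ranks.

    Assumption: RRF with constant k is used for fusion; the paper may
    combine the signals differently.
    """
    # Structured query: keep documents whose metadata matches every filter.
    candidates = [
        d for d in docs
        if all(d["meta"].get(field) == value for field, value in filters.items())
    ]
    ids = [d["id"] for d in candidates]
    kw = {d["id"]: keyword_score(query, d["text"]) for d in candidates}
    sem = {d["id"]: cosine(query_vec, d["vec"]) for d in candidates}
    kw_rank, sem_rank = rank(ids, kw), rank(ids, sem)
    # Reciprocal rank fusion: sum of 1 / (k + rank) over both rankings.
    fused = {i: 1 / (k + kw_rank[i]) + 1 / (k + sem_rank[i]) for i in ids}
    return sorted(ids, key=lambda i: fused[i], reverse=True)

# Toy corpus; the vectors are hand-written placeholders, not real embeddings.
docs = [
    {"id": "a", "text": "red running shoes",
     "meta": {"category": "footwear"}, "vec": [0.9, 0.1]},
    {"id": "b", "text": "lightweight trainers for jogging",
     "meta": {"category": "footwear"}, "vec": [0.8, 0.2]},
    {"id": "c", "text": "red running jacket",
     "meta": {"category": "apparel"}, "vec": [0.1, 0.9]},
]

results = hybrid_search(
    query="running shoes",
    query_vec=[1.0, 0.0],                # hypothetical query embedding
    filters={"category": "footwear"},    # hypothetical LLM-derived filter
    docs=docs,
)
print(results)
```

In this toy run the structured filter removes the jacket before any scoring happens, and document "b" stays in the results on semantic similarity alone even though it shares no terms with the query. That is the kind of implicit intent that keyword matching by itself would miss.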