Context Matters: Pushing the Boundaries of Open-Ended Answer Generation with Graph-Structured Knowledge Context

Somnath Banerjee, Amruit Sahoo, Sayan Layek, Avik Dutta, Rima Hazra, Animesh Mukherjee
2024-10-16
Abstract: In the continuously advancing AI landscape, crafting context-rich and meaningful responses with Large Language Models (LLMs) is essential. Researchers are becoming more aware of the challenges that LLMs with fewer parameters encounter when trying to provide suitable answers to open-ended questions. To address these hurdles, integrating cutting-edge strategies, such as augmenting LLMs with rich external domain knowledge, offers significant improvements. This paper introduces a novel framework that combines graph-driven context retrieval with knowledge-graph-based enhancement, honing the proficiency of LLMs, especially on domain-specific community question answering platforms like AskUbuntu, Unix, and ServerFault. We conduct experiments on various LLMs with different parameter sizes to evaluate their ability to ground knowledge and determine factual accuracy in answers to open-ended questions. Our methodology, GraphContextGen, consistently outperforms dominant text-based retrieval systems, demonstrating its robustness and adaptability to a larger number of use cases. This advancement highlights the importance of pairing context-rich data retrieval with LLMs, offering a renewed approach to knowledge sourcing and generation in AI systems. We also show that, due to rich contextual data retrieval, the crucial entities, along with the generated answer, remain factually coherent with the gold answer.
Computation and Language
What problem does this paper attempt to address?
This paper aims to generate high-quality, factually accurate answers on community question-answering platforms. Specifically, it focuses on the following aspects:

1. **Limitations of the knowledge base**: Although large language models (LLMs) perform well at text understanding and generation, they perform poorly in resource-limited environments, are restricted by their knowledge cutoff date, and are prone to hallucinations (i.e., generating untrue information).
2. **Limitations of text retrieval**: Traditional text-based retrieval methods handle complex questions poorly. They often rely on simple keyword matching and struggle to capture deeper semantic relationships, which leads to less accurate or less relevant results.
3. **Factual accuracy of generated answers**: On community question-answering platforms, generated answers need to remain factually coherent and consistent with the actual answers. Existing methods face challenges in this regard, especially in domain-specific question answering.

To overcome these limitations, the paper proposes a new framework, GraphContextGen, which combines a graph-based retrieval system with knowledge graph enhancement techniques to improve the contextual richness and factual accuracy of LLM-generated answers. With this method, the paper aims to improve the performance of LLMs on community question-answering platforms, especially in low-resource settings.
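To make the contrast between keyword matching and graph-based retrieval concrete, here is a minimal sketch of one way a graph-driven retrieval step might look. It is not the paper's implementation: the summary above does not specify the algorithm, so the Jaccard similarity, the random walk with restart, the function names (`build_graph`, `retrieve`), and all parameter values are assumptions chosen purely for illustration.

```python
from collections import defaultdict


def tokens(text):
    """Lowercased whitespace tokens as a set (a deliberately crude tokenizer)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard overlap between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(docs, threshold=0.15):
    """Link documents whose token overlap is at least `threshold` (assumed value)."""
    toks = [tokens(d) for d in docs]
    graph = defaultdict(dict)
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            w = jaccard(toks[i], toks[j])
            if w >= threshold:
                graph[i][j] = w
                graph[j][i] = w
    return graph


def retrieve(query, docs, graph, k=2, restart=0.5, iters=30):
    """Rank documents by a random walk with restart seeded at keyword matches.

    Documents with no keyword overlap can still score above zero when they
    are linked to a document that does match the query.
    """
    q = tokens(query)
    seed = [jaccard(q, tokens(d)) for d in docs]
    total = sum(seed)
    if total == 0:
        return []  # nothing matches the query, so there is no seed for the walk
    seed = [s / total for s in seed]
    score = seed[:]
    for _ in range(iters):
        nxt = [restart * s for s in seed]
        # Spread each linked node's score to its neighbours by edge weight.
        # Isolated nodes have no entry in `graph` and keep only their seed share.
        for i, nbrs in graph.items():
            out = sum(nbrs.values())
            for j, w in nbrs.items():
                nxt[j] += (1 - restart) * score[i] * w / out
        score = nxt
    ranked = sorted(range(len(docs)), key=lambda i: score[i], reverse=True)
    return [docs[i] for i in ranked[:k]]


# Hypothetical toy corpus standing in for past community questions.
docs = [
    "wifi driver not loading after kernel upgrade",
    "kernel upgrade broke broadcom wireless module",
    "how to change desktop wallpaper",
]
graph = build_graph(docs)
top = retrieve("wifi driver missing", docs, graph)
print(top)
```

In this toy corpus the second document shares no words with the query, so pure keyword matching would score it at zero. Because it is linked to the first document through their shared "kernel upgrade" tokens, the walk still ranks it above the unrelated wallpaper question.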