Ancient Wisdom, Modern Tools: Exploring Retrieval-Augmented LLMs for Ancient Indian Philosophy

Priyanka Mandikal
2024-08-24
Abstract: LLMs have revolutionized the landscape of information retrieval and knowledge dissemination. However, their application in specialized areas is often hindered by factual inaccuracies and hallucinations, especially in long-tail knowledge distributions. We explore the potential of retrieval-augmented generation (RAG) models for long-form question answering (LFQA) in a specialized knowledge domain. We present VedantaNY-10M, a dataset curated from extensive public discourses on the ancient Indian philosophy of Advaita Vedanta. We develop and benchmark a RAG model against a standard, non-RAG LLM, focusing on transcription, retrieval, and generation performance. Human evaluations by computational linguists and domain experts show that the RAG model significantly outperforms the standard model in producing factual and comprehensive responses having fewer hallucinations. In addition, a keyword-based hybrid retriever that emphasizes unique low-frequency terms further improves results. Our study provides insights into effectively integrating modern large language models with ancient knowledge systems. Project page with dataset and code: https://sites.google.com/view/vedantany-10m
Computation and Language, Computers and Society, Information Retrieval
What problem does this paper attempt to address?
The paper addresses the factual inaccuracies and hallucinations that large language models (LLMs) produce when a question falls in the long tail of their knowledge distribution, as it often does in specialized domains. The authors study one such domain: Advaita Vedanta, a school of ancient Indian philosophy. They propose a retrieval-augmented generation (RAG) model that grounds long-form question answering (LFQA) in passages retrieved from an external data store, with the aim of improving factual accuracy and reducing hallucinations in niche knowledge areas. To support this, they construct a dataset named VedantaNY-10M, which includes over 750 hours of publicly available lectures on Vedanta philosophy sourced from YouTube. Using this dataset, they develop the RAG model and benchmark it against a standard non-RAG LLM on transcription, retrieval, and generation. The results indicate that the RAG model significantly outperforms the standard model, producing responses that are more factual and comprehensive and that contain fewer hallucinations. The authors also propose a keyword-based hybrid retriever that further improves the RAG model, especially for queries containing low-frequency or domain-specific terms.
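To make the idea of a keyword-based hybrid retriever concrete, the following is a minimal sketch and not the paper's implementation. It assumes that "emphasizing low-frequency terms" can be approximated with inverse document frequency (IDF) weights, and it uses a bag-of-words cosine similarity as a stand-in for a dense embedding score. The function names (`tokenize`, `idf_table`, `hybrid_rank`) and the mixing weight `alpha` are hypothetical, introduced only for illustration.

```python
import math
import string
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (w.strip(string.punctuation).lower() for w in text.split())
    return [t for t in tokens if t]


def idf_table(passages):
    """Smoothed IDF: terms that occur in fewer passages get larger weights."""
    n = len(passages)
    doc_freq = Counter()
    for passage in passages:
        doc_freq.update(set(tokenize(passage)))
    return {t: math.log((n + 1) / (df + 1)) + 1.0 for t, df in doc_freq.items()}


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_rank(query, passages, alpha=0.5):
    """Rank passages by alpha * similarity + (1 - alpha) * keyword score.

    Assumption: the similarity term is a bag-of-words cosine standing in for
    a dense embedding score. The keyword term is the IDF mass of the query
    terms found in the passage, normalized by the query's total IDF mass,
    so rare domain terms count for more than common words.
    """
    idf = idf_table(passages)
    q_terms = set(tokenize(query))
    q_vec = Counter(tokenize(query))
    q_mass = sum(idf.get(t, 0.0) for t in q_terms)
    scored = []
    for index, passage in enumerate(passages):
        p_tokens = tokenize(passage)
        similarity = cosine(q_vec, Counter(p_tokens))
        matched = q_terms & set(p_tokens)
        keyword = sum(idf[t] for t in matched) / q_mass if q_mass else 0.0
        scored.append((alpha * similarity + (1 - alpha) * keyword, index))
    # Highest score first; ties broken by original passage order.
    return sorted(scored, key=lambda s: (-s[0], s[1]))


passages = [
    "The mind is restless and the mind is hard to control",
    "Turiya is the fourth state beyond waking dream and deep sleep",
    "The waking state and the dream state are discussed in the text",
]
ranking = hybrid_rank("What is turiya, the fourth state?", passages)
best_passage = passages[ranking[0][1]]
```

In this toy corpus the rare terms "turiya" and "fourth" occur in only one passage, so they dominate the keyword score and pull that passage to the top. Setting `alpha=1.0` disables the keyword term entirely, which gives a simple way to compare the two components on the same query.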