ERATTA: Extreme RAG for Table To Answers with Large Language Models

Sohini Roychowdhury, Marko Krema, Anvar Mahammad, Brian Moore, Arijit Mukherjee, Punit Prakashchandra

2024-09-02

Abstract: Large language models (LLMs) with retrieval-augmented generation (RAG) have been the optimal choice for scalable generative AI solutions in the recent past. Although RAG implemented with AI agents (agentic-RAG) has recently been popularized, it suffers from unstable cost and unreliable performance in enterprise-level data practices. Most existing use cases that incorporate RAG with LLMs have been either generic or extremely domain-specific, which calls into question the scalability and generalizability of RAG-LLM approaches. In this work, we propose a unique LLM-based system in which multiple LLMs can be invoked to enable data authentication, user-query routing, data retrieval, and custom prompting for question answering over enterprise data tables. The source tables here are highly fluctuating and large in size, and the proposed framework enables structured responses in under 10 seconds per query. Additionally, we propose a five-metric scoring module that detects and reports hallucinations in the LLM responses. Our proposed system and scoring metrics achieve >90% confidence scores across hundreds of user queries in the sustainability, financial health, and social media domains. Extensions to the proposed extreme RAG architectures can enable heterogeneous source querying using LLMs.

Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

The main problem this paper attempts to address is the cost instability and performance unreliability of existing large language model (LLM) and retrieval-augmented generation (RAG) techniques in enterprise-level data practices. Specifically, although methods combining RAG with agents (agentic-RAG) can improve the knowledge quality of retrieved content, they are costly and slow, and they struggle to serve a large number of users or groups when dealing with large-scale, highly volatile enterprise data tables. Additionally, existing RAG-LLM methods are either too generic or extremely specific to a particular domain, which raises questions about their scalability and generality.

To this end, the paper proposes a unique multi-LLM system framework that leverages multiple large language models for data authentication, user query routing, data retrieval, and custom prompting, so that answers can be obtained from enterprise data tables. This framework aims to address the following issues:

1. **Improve response speed**: Ensure structured responses are completed within 10 seconds for each query.
2. **Reduce hallucinations**: Propose a five-metric scoring module to detect and report hallucinations in LLM responses.
3. **Enhance scalability and generality**: Improve system scalability and efficiency by decomposing the RAG process into specific tasks (i.e., extreme RAG), thereby reducing maintenance and operational costs.
4. **Support heterogeneous source queries**: Extend the extreme RAG architecture to support heterogeneous source queries using LLMs.

Overall, the paper aims to provide faster, more accurate, more reliable, and more cost-effective question-answering solutions for enterprise-level data tables through an improved RAG method.
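The four stages named above (data authentication, user query routing, data retrieval, custom prompting) can be pictured as a chain of small, separately testable steps. The following is a minimal sketch under stated assumptions, not the paper's implementation: the `Table` class, every function name, the keyword-matching retrieval, and the stub LLM callables are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# Hypothetical stand-in for any LLM client: prompt text in, response text out.
LLM = Callable[[str], str]


@dataclass
class Table:
    name: str
    allowed_users: Set[str]
    rows: List[Dict[str, object]]


def authenticate(user: str, tables: Dict[str, Table]) -> Dict[str, Table]:
    """Stage 1 (assumed form): keep only tables this user may query."""
    allowed = {n: t for n, t in tables.items() if user in t.allowed_users}
    if not allowed:
        raise PermissionError(f"{user!r} has no accessible tables")
    return allowed


def route(query: str, tables: Dict[str, Table], router_llm: LLM) -> Table:
    """Stage 2: ask a routing LLM to pick one table name for the query."""
    prompt = f"Tables: {sorted(tables)}\nQuery: {query}\nReply with one table name."
    choice = router_llm(prompt).strip()
    if choice not in tables:
        raise KeyError(f"router returned unknown table {choice!r}")
    return tables[choice]


def retrieve(query: str, table: Table, max_rows: int = 5) -> List[Dict[str, object]]:
    """Stage 3 (naive placeholder): keep rows sharing a cell value with a query term."""
    terms = set(query.lower().split())
    hits = [r for r in table.rows if terms & {str(v).lower() for v in r.values()}]
    return hits[:max_rows]


def build_prompt(query: str, table: Table, rows: List[Dict[str, object]]) -> str:
    """Stage 4: wrap the retrieved rows in a table-specific prompt."""
    context = "\n".join(str(r) for r in rows) or "(no matching rows)"
    return f"Use only these rows from '{table.name}':\n{context}\nQuestion: {query}"


def answer(user: str, query: str, tables: Dict[str, Table],
           router_llm: LLM, answer_llm: LLM) -> str:
    """Run the four stages in order and return the answering LLM's response."""
    allowed = authenticate(user, tables)
    table = route(query, allowed, router_llm)
    rows = retrieve(query, table)
    return answer_llm(build_prompt(query, table, rows))


# Demo with stub LLMs so the sketch runs without any model or network access.
def router_stub(prompt: str) -> str:
    return "emissions"


def echo_stub(prompt: str) -> str:
    return prompt  # echoes the prompt so the retrieved context is visible


tables = {
    "emissions": Table(
        name="emissions",
        allowed_users={"alice"},
        rows=[
            {"company": "acme", "co2_tonnes": 120},
            {"company": "globex", "co2_tonnes": 75},
        ],
    )
}
result = answer("alice", "co2 for acme", tables, router_stub, echo_stub)
```

Keeping each stage as its own function mirrors the idea of decomposing the RAG process into specific tasks: a stage can be swapped (for example, replacing the keyword match with a real retriever) without touching the others.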

RAG based Question-Answering for Contextual Response Prediction System

Deploying Large Language Models With Retrieval Augmented Generation

Retrieval-Augmented Generation for Large Language Models: A Survey

T-RAG: Lessons from the LLM Trenches

Know Your RAG: Dataset Taxonomy and Generation Strategies for Evaluating RAG Systems

Meta Knowledge for Retrieval Augmented Large Language Models

Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation

A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models

Learning When to Retrieve, What to Rewrite, and How to Respond in Conversational QA

TableRAG: Million-Token Table Understanding with Language Models

Towards Optimizing a Retrieval Augmented Generation using Large Language Model on Academic Data

Question-Based Retrieval using Atomic Units for Enterprise RAG

Improving Retrieval for RAG based Question Answering Models on Financial Documents

Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models

AssistRAG: Boosting the Potential of Large Language Models with an Intelligent Information Assistant

Retrieval Augmented Generation (RAG) and Beyond: A Comprehensive Survey on How to Make your LLMs use External Data More Wisely

Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models

Satyrn: A Platform for Analytics Augmented Generation

RAG Does Not Work for Enterprises

SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains