Abstract: In this paper, we conduct a study to utilize LLMs as a solution for decision making that requires complex data analysis. We define Decision QA as the task of answering the best decision, $d_{best}$, for a decision-making question $Q$, business rules $R$ and a database $D$. Since there is no benchmark that can examine Decision QA, we propose Decision QA benchmark, DQA. It has two scenarios, Locating and Building, constructed from two video games (Europa Universalis IV and Victoria 3) that have almost the same goal as Decision QA. To address Decision QA effectively, we also propose a new RAG technique called the iterative plan-then-retrieval augmented generation (PlanRAG). Our PlanRAG-based LM generates the plan for decision making as the first step, and the retriever generates the queries for data analysis as the second step. The proposed method outperforms the state-of-the-art iterative RAG method by 15.8% in the Locating scenario and by 7.4% in the Building scenario, respectively. We release our code and benchmark at https://github.com/myeon9h/PlanRAG.

What problem does this paper attempt to address?

The core problem this paper addresses is how to use large language models (LLMs) to make decisions that require complex data analysis. Specifically, the authors define a new task, **Decision QA**: given a decision-making question, business rules, and a database, produce the best decision. To evaluate and improve performance on this task, they propose a new benchmark (DQA) and a new technique, **PlanRAG** (iterative plan-then-retrieval augmented generation).

### Main Problems and Solutions

1. **Problem Definition**:
   - **Decision QA**: A question-answering task. The inputs are a structured database $D$, business rules $R$, and a decision-making question $Q$; the output is the best decision $d_{\text{best}}$.
   - **DQA Benchmark**: Contains two scenarios, Locating and Building. They are constructed from the data of two strategy games (Europa Universalis IV and Victoria 3) and are designed to simulate real-world business decision-making situations.
2. **Limitations of Existing Methods**:
   - Existing RAG (retrieval-augmented generation) techniques mainly target knowledge-based question answering and perform poorly on decision-making tasks, especially in formulating analysis plans (Step 1).
3. **Proposed New Method**:
   - **PlanRAG**: A planning step is added so that the LLM first formulates an analysis plan, then generates queries according to the plan and retrieves data, and finally either re-plans or makes a decision based on the retrieved results. The process iterates until no further analysis is required.
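The plan, retrieve, re-plan, decide loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: `plan_rag`, `toy_make_plan`, `toy_next_step`, and the `PROFIT` table are hypothetical names invented for illustration, and the two toy functions stand in for what would be LLM calls in a real system.

```python
from typing import Callable, Dict, List, Optional, Tuple

Observation = Tuple[str, float]  # (query, retrieved value)
Step = Tuple[str, str]           # (kind, payload); kind is "query", "replan" or "decide"


def plan_rag(
    question: str,
    rules: str,
    run_query: Callable[[str], float],
    make_plan: Callable[[str, str, List[Observation]], List[str]],
    next_step: Callable[[List[str], List[Observation]], Step],
    max_iters: int = 10,
) -> Optional[str]:
    """Iterate plan -> retrieve -> (re-plan | decide) until a decision is made."""
    plan = make_plan(question, rules, [])  # initial analysis plan
    observations: List[Observation] = []
    for _ in range(max_iters):
        kind, payload = next_step(plan, observations)
        if kind == "query":
            # Retrieval: run the generated query against the database.
            observations.append((payload, run_query(payload)))
        elif kind == "replan":
            # Re-planning: revise the plan in light of what was retrieved.
            plan = make_plan(question, rules, observations)
        else:
            # "decide": the payload is the final decision.
            return payload
    return None  # iteration budget exhausted without a decision


# Toy stand-ins (assumptions): a three-row "database" and rule-based
# planners replacing the LLM.
PROFIT: Dict[str, float] = {"Lisbon": 4.0, "Venice": 6.5, "Hamburg": 5.2}


def toy_make_plan(question: str, rules: str, observations: List[Observation]) -> List[str]:
    # The first plan checks only two candidates; the revised plan covers all.
    return ["Lisbon", "Venice"] if not observations else list(PROFIT)


def toy_next_step(plan: List[str], observations: List[Observation]) -> Step:
    seen = {query for query, _ in observations}
    pending = [query for query in plan if query not in seen]
    if pending:
        return ("query", pending[0])
    if len(seen) < len(PROFIT):
        return ("replan", "")  # the plan missed a candidate
    best = max(observations, key=lambda obs: obs[1])[0]
    return ("decide", best)


decision = plan_rag(
    "Which city should host the new trading post?",
    "Maximize profit.",
    PROFIT.__getitem__,
    toy_make_plan,
    toy_next_step,
)
print(decision)
```

In this toy run the first plan overlooks one candidate, so the loop re-plans once before deciding. The `max_iters` cap is a design choice of the sketch to guarantee termination; the summary itself only says the process repeats until no further analysis is required.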
### Experimental Results

The authors verify the effectiveness of PlanRAG through experiments:

- **Performance Improvement**: In the Locating scenario, PlanRAG outperforms the existing state-of-the-art iterative RAG method by 15.8%; in the Building scenario, it outperforms that method by 7.4%.
- **Error Analysis**: PlanRAG significantly reduces errors caused by improper candidates (CAN) and missing data analysis (MIS).
- **Importance of Re-planning**: The re-planning step is crucial for decision-making accuracy, and its effect is more pronounced in the Building scenario.

### Summary

The main contributions of this paper are:

- Defining a new decision-making task (Decision QA) and proposing the corresponding benchmark (DQA).
- Proposing a new technique (PlanRAG) that significantly improves the performance of LLMs on decision-making tasks.
- Verifying the effectiveness of the new method through detailed experiments and analyzing how its performance differs across scenarios.

Through these contributions, the authors show the potential of LLMs in complex decision making and provide a useful reference for future research.

PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers
