Enhancing Financial Question Answering with a Multi-Agent Reflection Framework

Sorouralsadat Fatemi, Yuheng Hu
DOI: https://doi.org/10.1145/3677052.3698686
2024-10-29
Abstract: While Large Language Models (LLMs) have shown impressive capabilities in numerous Natural Language Processing (NLP) tasks, they still struggle with financial question answering (QA), particularly when numerical reasoning is required. Recently, LLM-based multi-agent frameworks have demonstrated remarkable effectiveness in multi-step reasoning, which is crucial for financial QA tasks as it involves extracting relevant information from tables and text and then performing numerical reasoning on the extracted data to infer answers. In this study, we propose a multi-agent framework incorporating a critic agent that reflects on the reasoning steps and final answers for each question. Additionally, we enhance our system by adding multiple critic agents, each focusing on a specific aspect of the answer. Our results indicate that this framework significantly improves performance compared to single-agent reasoning, with an average performance increase of 15% for the LLaMA3-8B model and 5% for the LLaMA3-70B model. Furthermore, our framework performs on par with, and in some cases surpasses, larger single-agent LLMs such as LLaMA3.1-405B and GPT-4o-mini, though it falls slightly short compared to Claude-3.5 Sonnet. Overall, our framework presents an effective solution to enhance open-source LLMs for financial QA tasks, offering a cost-effective alternative to larger models like Claude-3.5 Sonnet.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the poor performance of large language models (LLMs) on financial question answering (QA) when numerical reasoning is required. Although LLMs have demonstrated impressive capabilities in many natural language processing (NLP) tasks, financial QA remains difficult because it involves two steps: extracting the relevant information from tables and text, and then performing numerical reasoning on the extracted data to infer the answer. To address these challenges, the authors propose a multi-agent framework in which one or more critic agents reflect on the reasoning steps and the final answer. The paper's main points are:

1. **Problem Background**: Financial document analysis is crucial for evaluating company performance. It requires advanced financial knowledge, the ability to reason across tabular and textual data sources, and the ability to perform complex numerical reasoning. Existing methods have made some progress, but they are often limited by the complexity of data pre-processing and by the capabilities of the underlying language models.
2. **Research Motivation**: LLMs perform well on many reasoning tasks but still struggle with complex, multi-step ones. Researchers have therefore begun to build LLM-based multi-agent frameworks for complex multi-step decision-making and reasoning. The effectiveness of such multi-agent systems on financial QA, however, has not been fully explored.
3. **Solution**: The authors propose a multi-agent framework for financial QA, evaluated in three configurations:
   - **Single-agent Setup**: A single expert agent extracts data from tables and text and performs the mathematical reasoning.
   - **Two-agent Setup**: A critic agent is added to evaluate the expert agent's response and provide feedback that the expert uses to improve its reasoning.
   - **Three-agent Setup**: The critique is split into two subtasks handled by two critic agents, each focusing on a specific aspect of the answer.
4. **Experimental Results**: Experiments were conducted on three popular financial QA benchmark datasets (FinQA, ConvFinQA, and TAT-QA). The multi-agent framework significantly improves performance over single-agent reasoning, with an average increase of 15% for the LLaMA3-8B model and 5% for the LLaMA3-70B model. The framework performs on par with, and in some cases surpasses, larger single-agent models such as LLaMA3.1-405B and GPT-4o-mini, though it falls slightly short of Claude-3.5 Sonnet.

In conclusion, the paper aims to strengthen the numerical reasoning of smaller open-source LLMs on financial QA by introducing a multi-agent reflection framework, offering a more cost-effective alternative to larger models.
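The expert-and-critic setups summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `answer_with_reflection`, the `max_rounds` parameter, the agent signatures, and the split into an "extraction" critic and a "calculation" critic are all assumptions made for illustration. The agents are plain callables standing in for LLM calls, so the control flow can be run without any model.

```python
from typing import Callable, List, Optional

# Hypothetical signatures (assumptions, not from the paper):
# expert(question, context, feedback) -> answer string
Expert = Callable[[str, str, Optional[str]], str]
# critic(question, context, answer) -> feedback string, or None to accept
Critic = Callable[[str, str, str], Optional[str]]


def answer_with_reflection(
    question: str,
    context: str,
    expert: Expert,
    critics: List[Critic],
    max_rounds: int = 2,  # assumed cap on refinement rounds
) -> str:
    """Run the expert once, then let critics request revisions.

    An empty critic list gives the single-agent setup, one critic the
    two-agent setup, and two critics the three-agent setup.
    """
    answer = expert(question, context, None)
    for _ in range(max_rounds):
        # Collect feedback from every critic that objects to the answer.
        feedback = [f for f in (c(question, context, answer) for c in critics) if f]
        if not feedback:
            break  # all critics accept the current answer
        answer = expert(question, context, "\n".join(feedback))
    return answer


# Toy stand-ins for LLM-backed agents, for demonstration only.
def toy_expert(question: str, context: str, feedback: Optional[str]) -> str:
    # First attempt is wrong; the revised attempt uses the critic feedback.
    return "12.5%" if feedback else "10.0%"


def toy_extraction_critic(question: str, context: str, answer: str) -> Optional[str]:
    return None  # pretend the extracted figures are correct


def toy_calculation_critic(question: str, context: str, answer: str) -> Optional[str]:
    # (90 - 80) / 80 = 12.5%
    return None if answer == "12.5%" else "Recheck the percentage change."


if __name__ == "__main__":
    q = "What was the percentage change in revenue from 2022 to 2023?"
    ctx = "Revenue: 2022 = 80, 2023 = 90 (USD millions)"
    print(answer_with_reflection(q, ctx, toy_expert, []))
    print(answer_with_reflection(
        q, ctx, toy_expert, [toy_extraction_critic, toy_calculation_critic]
    ))
```

Passing the critics as a list keeps the three setups in one code path, which makes it easy to compare them on the same questions. In a real system each callable would wrap a prompt to an LLM such as LLaMA3-8B.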