Abstract: While Large Language Models (LLMs) have shown impressive capabilities in numerous Natural Language Processing (NLP) tasks, they still struggle with financial question answering (QA), particularly when numerical reasoning is required. Recently, LLM-based multi-agent frameworks have demonstrated remarkable effectiveness in multi-step reasoning, which is crucial for financial QA tasks as it involves extracting relevant information from tables and text and then performing numerical reasoning on the extracted data to infer answers. In this study, we propose a multi-agent framework incorporating a critic agent that reflects on the reasoning steps and final answers for each question. Additionally, we enhance our system by adding multiple critic agents, each focusing on a specific aspect of the answer. Our results indicate that this framework significantly improves performance compared to single-agent reasoning, with an average performance increase of 15% for the LLaMA3-8B model and 5% for the LLaMA3-70B model. Furthermore, our framework performs on par with, and in some cases surpasses, larger single-agent LLMs such as LLaMA3.1-405B and GPT-4o-mini, though it falls slightly short compared to Claude-3.5 Sonnet. Overall, our framework presents an effective solution to enhance open-source LLMs for financial QA tasks, offering a cost-effective alternative to larger models like Claude-3.5 Sonnet.

What problem does this paper attempt to address?

This paper addresses the poor performance of large language models (LLMs) on financial question-answering tasks that require numerical reasoning. Although LLMs have demonstrated impressive capabilities in many natural language processing (NLP) tasks, financial question answering remains difficult because it involves extracting relevant information from tables and text and then performing numerical reasoning on the extracted data to infer answers. To address these challenges, the authors propose a multi-agent framework that improves performance by introducing one or more critic agents that reflect on the reasoning steps and the final answer. The paper can be summarized as follows:

1. **Problem Background**: Financial document analysis is crucial for evaluating the performance of enterprises and companies. It requires advanced financial knowledge, the ability to reason across table and text data sources, and the ability to perform complex numerical reasoning. Although existing methods have made some progress, they are often limited by the complexity of data pre-processing and the capabilities of the underlying language models.
2. **Research Motivation**: Although LLMs perform well in various reasoning tasks, they still face challenges in complex reasoning tasks. For this reason, researchers have begun to develop LLM-based multi-agent frameworks for complex multi-step decision-making and reasoning. However, the effectiveness of these multi-agent systems in financial question-answering tasks has not been fully explored.
3. **Solution**: The authors propose a multi-agent framework for financial question-answering tasks. The framework includes:
   - **Single-agent Setup**: Only one expert agent is used, which is responsible for extracting data from tables and text and performing mathematical reasoning.
   - **Two-agent Setup**: A critic agent is added to evaluate the response of the expert agent and provide feedback to improve its reasoning.
   - **Three-agent Setup**: The critique task is further divided into two subtasks, handled by two critic agents, each focusing on a specific aspect of the answer.
4. **Experimental Results**: Experiments were conducted on three popular financial question-answering benchmark datasets (FinQA, ConvFinQA, and TAT-QA). The results show that the multi-agent framework significantly improves performance over single-agent reasoning, especially in tasks involving numerical reasoning, with an average increase of 15% for the LLaMA3-8B model and 5% for the LLaMA3-70B model. The framework also performs on par with, and in some cases surpasses, larger single-agent models such as LLaMA3.1-405B and GPT-4o-mini, although it falls slightly short of Claude-3.5 Sonnet.

In conclusion, this paper aims to enhance the numerical reasoning ability of smaller open-source LLMs in financial question-answering tasks by introducing a multi-agent framework, thereby providing a more cost-effective alternative to larger models.
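
The single-, two-, and three-agent setups described above can be illustrated with a minimal sketch of an expert-plus-critics loop. This is not the authors' implementation: the function `answer_with_reflection`, the prompt wording, the aspect names (`extraction`, `calculation`), and the stub models are all assumptions made for illustration, and the stubs return fixed strings in place of real LLM calls.

```python
from typing import Callable, Dict

# Hypothetical interface: an "LLM" is any function from prompt text to response text.
LLM = Callable[[str], str]


def answer_with_reflection(question: str, context: str, expert: LLM,
                           critics: Dict[str, LLM], rounds: int = 1) -> str:
    """Expert drafts an answer; each critic comments on it; the expert revises."""
    task = f"Context:\n{context}\n\nQuestion: {question}"
    answer = expert(task)
    for _ in range(rounds):
        if not critics:  # single-agent setup: no critic, so no revision step
            break
        # Each critic sees the task and the current answer (assumed prompt format).
        notes = [
            f"[{aspect}] " + critic(f"{task}\n\nProposed answer:\n{answer}")
            for aspect, critic in critics.items()
        ]
        feedback = "\n".join(notes)
        answer = expert(
            f"{task}\n\nPrevious answer:\n{answer}\n\nFeedback:\n{feedback}"
        )
    return answer


# Deterministic stand-ins for real model calls (illustration only).
def stub_expert(prompt: str) -> str:
    # Pretends to correct itself once it has received feedback.
    return "20.0%" if "Feedback:" in prompt else "0.2%"


def stub_extraction_critic(prompt: str) -> str:
    return "The values 100 and 120 match the table."


def stub_calculation_critic(prompt: str) -> str:
    return "(120 - 100) / 100 = 0.20, which is 20%, not 0.2%."


context = "Revenue | 2019: 100 | 2020: 120"
question = "What was the percentage change in revenue from 2019 to 2020?"

single = answer_with_reflection(question, context, stub_expert, {})
two = answer_with_reflection(question, context, stub_expert,
                             {"overall": stub_calculation_critic})
three = answer_with_reflection(question, context, stub_expert,
                               {"extraction": stub_extraction_critic,
                                "calculation": stub_calculation_critic})
print(single, two, three)
```

Passing an empty `critics` mapping corresponds to the single-agent setup, one critic to the two-agent setup, and two aspect-specific critics to the three-agent setup. How the paper actually divides the aspects between its critics is not stated in this summary, so the split shown here is only one plausible choice.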

Enhancing Financial Question Answering with a Multi-Agent Reflection Framework
