Financial Knowledge Large Language Model

Cehao Yang, Chengjin Xu, Yiyan Qi
2024-06-29
Abstract: Artificial intelligence is making significant strides in the finance industry, revolutionizing how data is processed and interpreted. Among these technologies, large language models (LLMs) have demonstrated substantial potential to transform financial services by automating complex tasks, enhancing customer service, and providing detailed financial analysis. Firstly, we introduce IDEA-FinBench, an evaluation benchmark specifically tailored for assessing financial knowledge in LLMs. This benchmark utilizes questions from two globally respected and authoritative financial professional exams, aiming to comprehensively evaluate the capability of LLMs to directly address exam questions pertinent to the finance sector. Secondly, we propose IDEA-FinKER, a Financial Knowledge Enhancement framework designed to facilitate the rapid adaptation of general LLMs to the financial domain, introducing a retrieval-based few-shot learning method for real-time context-level knowledge injection, and a set of high-quality financial knowledge instructions for fine-tuning any general LLM. Finally, we present IDEA-FinQA, a financial question-answering system powered by LLMs. This system is structured around a scheme of real-time knowledge injection and factual enhancement using external knowledge. IDEA-FinQA comprises three main modules: the data collector, the data querying module, and LLM-based agents tasked with specific functions.
Computation and Language
What problem does this paper attempt to address?
The paper aims to address the following issues:

1. **Financial Knowledge Benchmark**: Although current evaluation benchmarks for large language models (LLMs) are broad in their general coverage, they do not adequately evaluate specialized fields such as finance. It therefore remains an open question whether popular LLMs possess professional skills and knowledge comparable to those of human financial experts, and whether they can effectively handle automated tasks in the financial industry.
2. **Enhancement with Financial Knowledge**: Adapting LLMs to specific domains such as finance presents significant challenges. Although efforts have been made to enhance base models through further pre-training and fine-tuning on financial text and instruction datasets, these attempts have not achieved the desired results and, in some cases, have even degraded performance. How to effectively integrate financial knowledge into LLMs through methods such as in-context learning or supervised fine-tuning therefore still requires further research.

To address these issues, the paper proposes the following main contributions:

- **IDEA-FinBench**: A new evaluation benchmark designed to assess the financial knowledge of LLMs using questions from two internationally renowned and authoritative financial professional exams. The benchmark covers both Chinese and English, four question formats, and 16 financial subject areas, in order to comprehensively evaluate the ability of LLMs to directly tackle finance-related exam questions.
- **IDEA-FinKER**: A financial knowledge enhancement framework aimed at facilitating the rapid adaptation of general LLMs to the financial domain, reducing the need for external pre-training. IDEA-FinKER is based on a carefully curated Chinese financial exam question bank, supports real-time contextual knowledge injection, and introduces a set of high-quality financial knowledge instructions for fine-tuning general LLMs.
- **IDEA-FinQA**: A dynamic financial question-answering system driven by LLMs, which uses external knowledge bases for real-time knowledge injection and factual enhancement. The system comprises three main modules: a data collector, a data query module, and four specialized LLM agents, each responsible for specific tasks, thereby improving the system's effectiveness.
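
The retrieval-based few-shot idea described for IDEA-FinKER can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words cosine similarity is a stand-in for whatever retriever the paper actually uses, the three-item question bank is toy data written for this example, and the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`) are hypothetical. The sketch only shows the general pattern of retrieving the most similar solved questions and prepending them to the prompt as in-context examples.

```python
import math
import re
from collections import Counter

# Toy question bank (illustrative only, not taken from the paper's data).
BANK = [
    {"question": "What does the yield to maturity of a bond measure?",
     "answer": "The annualized return if the bond is held to maturity."},
    {"question": "Which financial statement reports revenues and expenses?",
     "answer": "The income statement."},
    {"question": "What is the formula for the current ratio?",
     "answer": "Current assets divided by current liabilities."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, bank, k=2):
    """Return the k bank entries whose questions are most similar to the query."""
    q_vec = Counter(tokenize(query))
    ranked = sorted(
        bank,
        key=lambda ex: cosine(q_vec, Counter(tokenize(ex["question"]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, bank, k=2):
    """Prepend the retrieved question/answer pairs to the query as few-shot examples."""
    parts = [
        f"Question: {ex['question']}\nAnswer: {ex['answer']}"
        for ex in retrieve(query, bank, k)
    ]
    parts.append(f"Question: {query}\nAnswer:")
    return "\n\n".join(parts)
```

Calling `build_prompt("How is the current ratio calculated?", BANK, k=1)` yields a prompt whose single in-context example is the current-ratio question, followed by the new question with an empty answer slot for the LLM to complete. In a real system the lexical similarity would typically be replaced by a stronger retriever, but the prompt-assembly step stays the same.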