SNFinLLM: Systematic and Nuanced Financial Domain Adaptation of Chinese Large Language Models

Shujuan Zhao, Lingfeng Qiao, Kangyang Luo, Qian-Wen Zhang, Junru Lu, Di Yin
2024-08-05
Abstract: Large language models (LLMs) have become powerful tools for advancing natural language processing applications in the financial industry. However, existing financial LLMs often face challenges such as hallucinations or superficial parameter training, resulting in suboptimal performance, particularly in financial computing and machine reading comprehension (MRC). To address these issues, we propose a novel large language model specifically designed for the Chinese financial domain, named SNFinLLM. SNFinLLM excels in domain-specific tasks such as answering questions, summarizing financial research reports, analyzing sentiment, and executing financial calculations. Specifically, we gather extensive financial data and create a high-quality instruction dataset composed of news articles, professional papers, and research reports from the finance domain. Using both domain-specific and general datasets, we continue pre-training an established open-source base model, resulting in SNFinLLM-base. Following this, we apply supervised fine-tuning (SFT) to strengthen the model's capability across multiple financial tasks. Crucially, we employ a straightforward Direct Preference Optimization (DPO) method to better align the model with human preferences. Extensive experiments conducted on finance benchmarks and our evaluation dataset demonstrate that SNFinLLM markedly outperforms other state-of-the-art financial language models. For more details, see our demo video: https://www.youtube.com/watch?v=GYT-65HZwus
Computation and Language
What problem does this paper attempt to address?
The paper addresses hallucinations and superficial parameter training in large language models (LLMs) for the financial domain, which lead to poor performance on tasks such as financial computation and machine reading comprehension (MRC). To solve these problems, the authors propose SNFinLLM, a large language model built specifically for the Chinese financial domain. The model is developed through the following steps:

1. **Data Collection and Processing**: Collect a large amount of financial-domain data, including news articles, professional papers, and research reports, and create a high-quality instruction dataset.
2. **Continual Pre-training**: Continue pre-training an open-source base model on both financial-domain and general data, yielding SNFinLLM-base.
3. **Supervised Fine-Tuning (SFT)**: Fine-tune on domain-specific instruction data to strengthen the model's capabilities across financial tasks.
4. **Direct Preference Optimization (DPO)**: Apply a straightforward DPO method to align the model more closely with human preferences.
5. **Calculator Tool Integration**: Handle computational tasks by invoking a Python interpreter so that calculations are accurate.

With these steps, SNFinLLM performs well across financial-domain tasks, particularly financial computation and machine reading comprehension. Experimental results on finance benchmarks and the authors' own evaluation dataset show that SNFinLLM markedly outperforms other state-of-the-art financial language models.
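The Direct Preference Optimization step mentioned above can be illustrated with the standard DPO objective for a single preference pair. This is a minimal sketch, not the authors' implementation: the text here does not specify which DPO variant or hyperparameters SNFinLLM uses, so the function names, the argument layout, and the default `beta=0.1` are assumptions chosen for illustration. Each argument is the summed log-probability that a model assigns to a full response.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; not taken from the paper
) -> float:
    """Standard DPO loss for one (chosen, rejected) response pair.

    The implicit reward of a response is the log-ratio between the policy
    being trained and a frozen reference model. The loss is small when the
    policy favors the chosen response more strongly than the reference does.
    """
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    return -log_sigmoid(margin)


# Hypothetical log-probabilities, for illustration only.
neutral = dpo_loss(0.0, 0.0, 0.0, 0.0)            # policy identical to reference
improved = dpo_loss(-10.0, -20.0, -12.0, -15.0)   # policy favors the chosen answer
print(neutral, improved)
```

When the policy and the reference agree exactly, the margin is zero and the loss equals log 2. Shifting probability mass toward the preferred response lowers the loss, and shifting it toward the rejected response raises it. In practice the per-pair losses would be averaged over a batch of preference data.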