CryptoGPT: a 7B model rivaling GPT-4 in the task of analyzing and classifying real-time financial news

Ying Zhang, Matthieu Petit Guillaume, Aurélien Krauth, Manel Labidi
2024-06-20
Abstract: CryptoGPT: a 7B model competing with GPT-4 in a specific task -- The Impact of Automatic Annotation and Strategic Fine-Tuning via QLoRA. In this article, we present a method for refining a dedicated LLM of reasonable quality with limited resources in an industrial setting, illustrated by CryptoGPT, an LLM designed for real-time analysis of financial news for the cryptocurrency market. The model not only classifies financial information but also provides comprehensive analysis. We refined different LLMs of the same size, such as Mistral-7B and LLaMA-7B, using semi-automatic annotation and compared them with various LLMs such as GPT-3.5 and GPT-4. Our goal is to find a balance among several needs: 1. Protecting data (by avoiding its transfer to external servers), 2. Limiting annotation cost and time, 3. Controlling the model's size (to manage deployment costs), and 4. Maintaining better analysis quality.
Artificial Intelligence; Computational Engineering, Finance, and Science; Computation and Language; Neural and Evolutionary Computing
What problem does this paper attempt to address?
### Problems the paper attempts to solve

The paper aims to solve the following problems:

1. **Real-time analysis and classification of financial news**:
   - The paper focuses on how to use large language models (LLMs) to analyze and classify financial market news in a real-time environment, especially for the cryptocurrency market. Because the cryptocurrency market is highly volatile, traditional financial models often struggle to predict market dynamics accurately.
   - The researchers aim to develop a more efficient, lower-cost model that can capture market-moving events in real-time data and provide timely risk warnings.
2. **Model optimization under resource constraints**:
   - Although large language models such as GPT-4 perform well, their training and deployment require substantial computing resources and incur high costs, which many small and medium-sized enterprises (SMEs) cannot afford.
   - The researchers therefore explore how to fine-tune smaller pre-trained models (such as LLaMA-2 with 7B parameters and Mistral-7B) so that they reach performance comparable to large models under limited resources.
3. **Data labeling and model training**:
   - The lack of high-quality financial sentiment analysis datasets is another challenge in this field. The researchers propose an automated labeling method that combines multiple large language models with manual verification to construct a dataset covering multiple financial categories.
   - This method lets the researchers generate a large amount of labeled data for model training and evaluation.
4. **Model performance evaluation**:
   - To verify the effectiveness of the model, the researchers design a set of evaluation methods, including intrinsic evaluation and expert review.
   - By comparing the performance of different models, the researchers aim to show that their model can match or even outperform commercial large models on financial news analysis tasks.

### Summary

The core problem of this paper is to develop a cost-effective, high-performance model for real-time analysis and classification of financial news in the cryptocurrency market. By optimizing the data labeling process, fine-tuning pre-trained models, and evaluating performance rigorously, the researchers aim to give small and medium-sized enterprises a feasible way to deal with the complexity and volatility of the cryptocurrency market.
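
### Illustrative sketch: combining LLM annotations with manual verification

The labeling step described above (several LLMs annotate each news item, with manual verification) can be illustrated with a small sketch. The summary does not say how the annotators' outputs are combined, so the majority vote, the `min_agreement` threshold, the function names, and the category labels below are all assumptions made for illustration and not the authors' procedure.

```python
from collections import Counter
from typing import Dict, List, Tuple


def aggregate_labels(votes: List[str], min_agreement: float = 0.6) -> Tuple[str, bool]:
    """Return (majority_label, needs_review) for one news item.

    Assumption: each vote is the category proposed by one LLM annotator.
    An item is flagged for manual review when the share of annotators
    agreeing on the top label falls below `min_agreement`.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    label, top_count = Counter(votes).most_common(1)[0]
    agreement = top_count / len(votes)
    return label, agreement < min_agreement


def build_dataset(
    annotations: Dict[str, List[str]], min_agreement: float = 0.6
) -> Tuple[Dict[str, str], List[str]]:
    """Split items into auto-accepted labels and a manual review queue."""
    accepted: Dict[str, str] = {}
    review_queue: List[str] = []
    for news_id, votes in annotations.items():
        label, needs_review = aggregate_labels(votes, min_agreement)
        if needs_review:
            review_queue.append(news_id)  # a human annotator decides
        else:
            accepted[news_id] = label
    return accepted, review_queue


if __name__ == "__main__":
    # Hypothetical category names and votes from three LLM annotators.
    example = {
        "news-1": ["regulation", "regulation", "regulation"],
        "news-2": ["hack", "regulation", "market"],
        "news-3": ["hack", "hack", "market"],
    }
    accepted, review_queue = build_dataset(example)
    print("accepted:", accepted)
    print("needs manual review:", review_queue)
```

Sending only the low-agreement items to a human reviewer is one way to keep annotation cost and time limited, which is among the goals stated in the abstract. The threshold trades annotation effort against label noise.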