Abstract: This paper introduces BreakGPT, a novel large language model (LLM) architecture adapted specifically for time series forecasting and the prediction of sharp upward movements in asset prices. By leveraging both the capabilities of LLMs and Transformer-based models, this study evaluates BreakGPT and other Transformer-based models for their ability to address the unique challenges posed by highly volatile financial markets. The primary contribution of this work lies in demonstrating the effectiveness of combining time series representation learning with LLM prediction frameworks. We showcase BreakGPT as a promising solution for financial forecasting with minimal training and as a strong competitor for capturing both local and global temporal dependencies.
What problem does this paper attempt to address?
This paper addresses the prediction of sharp rises in asset prices in highly volatile financial markets, particularly price surges in the cryptocurrency market. Because financial data are inherently random and influenced by external factors, forecasting market behavior, and identifying such surges in particular, remains a challenging problem. The paper introduces BreakGPT, a new architecture that combines the strengths of large language models (LLMs) and Transformer-based models, with the aim of improving time-series prediction accuracy, especially in capturing both short-term and long-term temporal dependencies.
### Main Contributions
1. **Developed an improved time-series prediction architecture**: Based on GPT-2, it is adapted to time-series prediction through domain-specific prompts and embeddings.
2. **Implemented multiple Transformer-based models**: These models use the attention mechanism and convolutional layers to process financial time-series data.
3. **Evaluated these models on a real-world cryptocurrency dataset**: Analyzed their effectiveness in predicting price surges.
### Methods
- **Data Preparation**: Use the price data of Solana cryptocurrency from February 1st to August 15th, with the data from July 15th to August 15th as the test set. The data contains OHLC (open, high, low, close) data and additional features such as Simple Moving Average (SMA), Exponential Moving Average (EMA), Relative Strength Index (RSI), and Bollinger Bands (BB).
- **Target Creation**: Determine significant price changes by identifying higher highs (HH), lower lows (LL), higher lows (HL), and lower highs (LH), and applying a volatility filter.
- **Model Architecture**:
  - **Simple Transformer**: As a baseline model, it includes an embedding layer, multi-head attention, positional encoding, and a fully connected layer.
  - **ConvTransformer**: Adds a one-dimensional convolutional layer to the Simple Transformer to capture short-term and long-term patterns.
  - **BreakGPT**: A modified version of GPT-2 that uses prompts to guide the model to focus on detecting sharply rising price trends.
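
As an illustration of the feature-engineering step under Data Preparation, the following is a minimal sketch of the four indicators in plain Python. It is not the paper's code: the function names, the window lengths, the RSI variant (simple averages of gains and losses rather than Wilder's smoothing), and the use of the population standard deviation for the Bollinger Bands are all assumptions made for the example.

```python
from statistics import mean, pstdev


def sma(values, window):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(mean(values[i + 1 - window:i + 1]))
    return out


def ema(values, span):
    """Exponential moving average with alpha = 2 / (span + 1).

    Seeding with the first value is an assumption of this sketch.
    """
    alpha = 2.0 / (span + 1)
    out = [float(values[0])]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def rsi(values, period):
    """RSI from simple sums of gains and losses over the last `period` changes."""
    out = [None] * len(values)
    for i in range(period, len(values)):
        changes = [values[j] - values[j - 1] for j in range(i - period + 1, i + 1)]
        gains = sum(c for c in changes if c > 0)
        losses = sum(-c for c in changes if c < 0)
        if losses == 0:
            out[i] = 100.0  # no down moves in the window
        else:
            rs = gains / losses
            out[i] = 100.0 - 100.0 / (1.0 + rs)
    return out


def bollinger(values, window, k=2.0):
    """(lower, middle, upper) bands: SMA +/- k population standard deviations."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            w = values[i + 1 - window:i + 1]
            m, s = mean(w), pstdev(w)
            out.append((m - k * s, m, m + k * s))
    return out


# Toy close prices, invented for illustration (not Solana data).
closes = [10, 11, 12, 11, 13]
print(sma(closes, 3))
print(ema(closes, 3))
print(rsi(closes, 2))
print(bollinger(closes, 3))
```

In practice each indicator would be computed on the close price and stored alongside the OHLC values. The warm-up positions marked `None` would need to be dropped or masked before training.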
### Results
- **Performance Evaluation**: Evaluate the performance of the models in detecting upward trends (class 1) through precision, recall, F1-score, and accuracy. Because the classes are imbalanced, the F1-score of class 1 is given priority.
- **Result Comparison**:
  - **Simple Transformer**: Performs poorly in detecting upward trends, with an F1-score of only 0.12.
  - **ConvTransformer**: Improves on the baseline by integrating one-dimensional convolutional layers, residual connections, and SiLU activation functions, reaching an F1-score of 0.20.
  - **BreakGPT**: Despite receiving less training, it comes close to the ConvTransformer, with an F1-score of 0.16.
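
To make the emphasis on the class 1 F1-score concrete, here is a small sketch of how the four metrics can be computed from a list of predictions. The labels are invented toy data, not the paper's results, and `class_metrics` is a hypothetical helper name. The point of the example is that a model which never predicts a surge can still score a high accuracy on imbalanced data while its class 1 F1-score is zero.

```python
def class_metrics(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the `positive` class, plus overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Toy labels (assumed for illustration): 8 non-surge samples, 2 surge samples.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# A model that always predicts "no surge": accuracy looks fine, F1 for class 1 is zero.
always_zero = [0] * 10
print(class_metrics(y_true, always_zero))

# A model that finds one of the two surges at the cost of two false alarms.
some_hits = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
print(class_metrics(y_true, some_hits))
```

On this toy data the second model has the lower accuracy but the higher class 1 F1-score, which is the trade-off the evaluation is designed to surface.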
### Discussion
- **Simple Transformer**: As a baseline model, it struggles to capture the necessary patterns in volatile financial data.
- **ConvTransformer**: Significantly improves the prediction of upward trends by enhancing the model's ability to capture local and global patterns.
- **BreakGPT**: Shows strong potential, especially given its short training time, by using prompts to guide the model toward key features.
### Conclusions and Future Work
- **Conclusions**: The ConvTransformer captures short-term and long-term dependencies well, and BreakGPT also shows significant potential despite receiving less training.
- **Future Work**: Explore more complex LLM architectures to further improve prediction accuracy; address the class-imbalance problem through techniques such as oversampling, class weighting, or ensemble learning to optimize model performance, especially in detecting upward trends in imbalanced financial datasets.
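
As one possible reading of the class-weighting idea mentioned under Future Work, the sketch below derives inverse-frequency class weights and applies them in a weighted binary log loss. The paper does not specify a weighting scheme, so the formula `n_samples / (n_classes * class_count)`, the helper names, and the toy label counts are all assumptions of this example.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def weighted_log_loss(y_true, p_pos, weights, eps=1e-12):
    """Mean binary cross-entropy with each sample scaled by its class weight.

    `p_pos` holds the predicted probability of class 1 for each sample.
    """
    total = 0.0
    for y, p in zip(y_true, p_pos):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        likelihood = p if y == 1 else 1.0 - p
        total += -weights[y] * math.log(likelihood)
    return total / len(y_true)


# Toy label distribution (assumed): 90 non-surge samples, 10 surge samples.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)

# A confident miss on the rare class now costs more than the same miss
# on the common class.
print(weighted_log_loss([1], [0.1], weights))
print(weighted_log_loss([0], [0.9], weights))
```

With these weights, an error on a rare surge sample contributes about nine times as much to the loss as the same error on a non-surge sample, which pushes training away from the trivial "never predict a surge" solution.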