A Deep Reinforcement Learning Approach for Trading Optimization in the Forex Market with Multi-Agent Asynchronous Distribution

Davoud Sarani, Dr. Parviz Rashidi-Khazaee
2024-05-30
Abstract: In today's forex market, traders increasingly turn to algorithmic trading, leveraging computers to seek greater profits. Deep learning (DL) techniques, as cutting-edge advancements in machine learning, are capable of identifying patterns in financial data. Traders use these patterns to execute more effective trades while adhering to algorithmic trading rules. Deep reinforcement learning (DRL) methods, by directly executing trades based on identified patterns and assessing their profitability, offer advantages over traditional DL approaches. This research pioneers the application of a multi-agent (MA) RL framework with the state-of-the-art Asynchronous Advantage Actor-Critic (A3C) algorithm. The proposed method employs parallel learning across multiple asynchronous workers, each specialized in trading across multiple currency pairs, to explore the potential for nuanced strategies tailored to different market conditions and currency pairs. Two MA models, A3C with lock and A3C without lock, were proposed and trained on single-currency and multi-currency data. The results indicate that both models outperform the Proximal Policy Optimization (PPO) model. A3C with lock outperforms the others in the single-currency training scenario, and A3C without lock outperforms the others in the multi-currency scenario. The findings demonstrate that this approach facilitates broader and faster exploration of different currency pairs, significantly enhancing trading returns. Additionally, the agent can learn a more profitable trading strategy in a shorter time.
Computational Engineering, Finance, and Science, Artificial Intelligence, Computational Complexity
What problem does this paper attempt to address?
This paper attempts to address the problem of optimizing trading strategies in the foreign exchange market using a multi-agent asynchronous distributed deep reinforcement learning approach. Specifically, the authors propose a deep reinforcement learning (DRL) method based on a multi-agent (MA) framework, utilizing the state-of-the-art Asynchronous Advantage Actor-Critic (A3C) algorithm to enhance trading returns.

### Main Issues

1. **Limitations of Traditional Trading Methods**:
   - Traditional manual trading methods are easily influenced by emotions and psychological factors, leading to erroneous decisions.
   - Rule-based algorithms can identify specific market patterns but lack flexibility and adaptability.
   - Supervised learning methods have limited predictive power in financial markets and cannot directly execute trading decisions.
2. **Shortcomings of Single-Agent Methods**:
   - Single-agent methods perform poorly in handling complex and dynamic financial markets, struggling to adapt to multiple currency pairs and different market conditions.
   - Single-agent learning efficiency is low, requiring a long time to explore and optimize strategies.

### Solutions

1. **Multi-Agent Asynchronous Distributed A3C Algorithm**:
   - Utilize multiple asynchronously working agents, each focusing on trading different currency pairs, to achieve broader market exploration.
   - Accelerate the learning process and improve the generalization and adaptability of strategies through parallel learning and knowledge sharing.
   - Propose two A3C models, one with lock and one without lock, each trained and tested in both single-currency and multi-currency scenarios.
2. **Data Preparation and Reward Function**:
   - Use candlestick data from the forex market, generating new features through normalization.
   - Design a reward function that gives the agents feedback based on the profit or loss of their trading decisions, so that they can optimize their trading strategies.
3. **Model Structure**:
   - Employ a unified neural network model, including LSTM layers and fully connected layers, to process normalized time series data and trading decisions.
   - The model output is divided into two parts: one for generating trading decisions and the other for evaluating the quality of the decisions.

### Experimental Results

- Experimental results show that the proposed multi-agent A3C model outperforms the Proximal Policy Optimization (PPO) model in both single-currency and multi-currency scenarios.
- The A3C with lock model performs best in single-currency training scenarios, while the A3C without lock model performs better in multi-currency scenarios.
- This method can explore different currency pairs more quickly, significantly improve trading returns, and learn more effective trading strategies in a shorter time.

### Conclusion

This paper addresses the limitations of traditional trading methods and single-agent methods in the forex market by introducing a multi-agent asynchronous distributed A3C algorithm, enhancing the optimization and adaptability of trading strategies.
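
### Illustrative Sketch: Lock vs. Lock-Free Updates

The summary above does not spell out what "with lock" and "without lock" mean in the paper. The sketch below assumes that "lock" refers to a mutex guarding each worker's update of the shared global parameters, and that "without lock" means workers write to those parameters with no synchronization. It is a toy illustration only, not the authors' implementation: the single scalar `w` stands in for the network weights, and `train_shared`, `toy_targets`, and the quadratic loss are hypothetical names and choices made for this example.

```python
import threading


def train_shared(targets, steps=200, lr=0.05, use_lock=True):
    """Run one worker thread per target against a single shared parameter.

    Each target is a hypothetical stand-in for the optimum that one
    currency pair's data would pull the shared weights toward.
    """
    shared = {"w": 0.0}  # stand-in for the global network weights
    lock = threading.Lock()

    def update(target):
        w = shared["w"]              # pull the current global value
        grad = 2.0 * (w - target)    # gradient of the toy loss (w - target)^2
        shared["w"] = w - lr * grad  # push the updated value back

    def worker(target):
        for _ in range(steps):
            if use_lock:
                with lock:           # pull, compute, push happen atomically
                    update(target)
            else:
                update(target)       # lock-free: concurrent writes may overwrite each other

    threads = [threading.Thread(target=worker, args=(t,)) for t in targets]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return shared["w"]


toy_targets = [1.0, 2.0, 3.0]  # hypothetical per-pair optima
w_locked = train_shared(toy_targets, use_lock=True)
w_lockfree = train_shared(toy_targets, use_lock=False)
```

In both variants the shared value ends up somewhere between the workers' targets; the exact result depends on thread scheduling, so no specific output is claimed here. The design trade-off this illustrates is that a lock serializes updates, so no worker's step is lost, while the lock-free variant avoids waiting at the cost of occasionally overwriting another worker's update.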