Deep Reinforcement Learning for Active High Frequency Trading

Antonio Briola, Jeremy Turiel, Riccardo Marcaccioli, Alvaro Cauderan, Tomaso Aste
2023-08-19
Abstract: We introduce the first end-to-end Deep Reinforcement Learning (DRL) based framework for active high frequency trading in the stock market. We train DRL agents to trade one unit of Intel Corporation stock by employing the Proximal Policy Optimization algorithm. The training is performed on three contiguous months of high frequency Limit Order Book data, of which the last month constitutes the validation data. In order to maximise the signal to noise ratio in the training data, we compose the latter by only selecting training samples with largest price changes. The test is then carried out on the following month of data. Hyperparameters are tuned using the Sequential Model Based Optimization technique. We consider three different state characterizations, which differ in their LOB-based meta-features. Analysing the agents' performances on test data, we argue that the agents are able to create a dynamic representation of the underlying environment. They identify occasional regularities present in the data and exploit them to create long-term profitable trading strategies. Indeed, agents learn trading strategies able to produce stable positive returns in spite of the highly stochastic and non-stationary environment.
Machine Learning, Artificial Intelligence, Multiagent Systems, Trading and Market Microstructure
What problem does this paper attempt to address?
This paper addresses the problem of automating active High Frequency Trading (HFT) in the stock market. Specifically, the authors introduce an end-to-end framework based on Deep Reinforcement Learning (DRL) that trains agents to trade a single unit of Intel Corporation stock at high frequency. The main objectives of the paper are:

1. **Developing a DRL framework**: Constructing a DRL framework, trained with the Proximal Policy Optimization algorithm, that learns and executes high-frequency trading strategies capable of generating stable positive returns in a highly stochastic and non-stationary market environment.
2. **Improving the signal-to-noise ratio**: Raising the signal-to-noise ratio of the training data by building the training set only from the samples with the largest price changes, so that training is more effective.
3. **Optimizing hyperparameters**: Tuning the model's hyperparameters with the Sequential Model Based Optimization (SMBO) technique to improve the agent's performance.
4. **Exploring different state representations**: Comparing three state representations that differ in their Limit Order Book (LOB) based meta-features, to evaluate their impact on the agent's performance.
5. **Validating long-term profitability**: Testing, on a held-out month of data, whether the agent can identify occasional regularities in the data and exploit them to build long-term profitable trading strategies.

In summary, the paper aims to show how DRL can be used to automate high-frequency trading and to assess the feasibility and effectiveness of this approach in practice.
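The signal-to-noise step (selecting the training samples with the largest price changes) can be illustrated with a minimal sketch. This summary does not specify the paper's exact selection rule, so everything below is an assumption made for illustration: the function name `select_high_signal_indices`, the use of absolute mid-price changes over a fixed `horizon`, and the `keep_fraction` threshold are all hypothetical and not taken from the authors' code.

```python
import numpy as np


def select_high_signal_indices(mid_prices, horizon=1, keep_fraction=0.1):
    """Return the sorted start indices of the largest absolute price changes.

    Hypothetical helper: the selection rule (absolute change over `horizon`
    steps, keeping the top `keep_fraction` of samples) is an assumption,
    not the procedure used in the paper.
    """
    prices = np.asarray(mid_prices, dtype=float)
    if prices.ndim != 1 or len(prices) <= horizon:
        raise ValueError("mid_prices must be 1-D and longer than horizon")
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")

    # Absolute price change between step t and step t + horizon.
    changes = np.abs(prices[horizon:] - prices[:-horizon])

    # Keep at least one sample, otherwise the requested fraction.
    n_keep = max(1, int(round(keep_fraction * len(changes))))

    # Indices of the n_keep largest changes, returned in time order.
    top = np.argsort(changes, kind="stable")[-n_keep:]
    return np.sort(top)


# Toy mid-price series; the two largest moves start at steps 1 and 3.
toy_prices = [10.0, 10.0, 10.5, 10.5, 9.0, 9.1]
selected = select_high_signal_indices(toy_prices, horizon=1, keep_fraction=0.4)
print(selected)
```

Returning indices rather than the filtered prices lets the same selection be applied to whatever LOB-derived state features accompany each time step. In a real pipeline the selection would presumably be applied to the training split only, leaving validation and test data unfiltered.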