An Adaptive Dual-level Reinforcement Learning Approach for Optimal Trade Execution

Soohan Kim,Jimyeong Kim,Hong Kee Sul,Youngjoon Hong
2023-07-20
Abstract:The purpose of this research is to devise a tactic that can closely track the daily cumulative volume-weighted average price (VWAP) using reinforcement learning. Previous studies often choose a relatively short trading horizon to implement their models, making it difficult to accurately track the daily cumulative VWAP since the variations of financial data are often insignificant within the short trading horizon. In this paper, we aim to develop a strategy that can accurately track the daily cumulative VWAP while minimizing the deviation from the VWAP. We propose a method that leverages the U-shaped pattern of intraday stock trade volumes and use Proximal Policy Optimization (PPO) as the learning algorithm. Our method follows a dual-level approach: a Transformer model that captures the overall(global) distribution of daily volumes in a U-shape, and a LSTM model that handles the distribution of orders within smaller(local) time intervals. The results from our experiments suggest that this dual-level architecture improves the accuracy of approximating the cumulative VWAP, when compared to previous reinforcement learning-based models.
Computational Finance,Trading and Market Microstructure
What problem does this paper attempt to address?