Optimizing Portfolio with Two-Sided Transactions and Lending: A Reinforcement Learning Framework

Ali Habibnia, Mahdi Soltanzadeh
2024-08-10
Abstract: This study presents a Reinforcement Learning (RL)-based portfolio management model tailored for high-risk environments, addressing the limitations of traditional RL models and exploiting market opportunities through two-sided transactions and lending. Our approach integrates a new environmental formulation with a Profit and Loss (PnL)-based reward function, enhancing the RL agent's ability in downside risk management and capital optimization. We implemented the model using the Soft Actor-Critic (SAC) agent with a Convolutional Neural Network with Multi-Head Attention (CNN-MHA). This setup effectively manages a diversified 12-crypto asset portfolio in the Binance perpetual futures market, leveraging USDT for both granting and receiving loans and rebalancing every 4 hours, utilizing market data from the preceding 48 hours. Tested over two 16-month periods of varying market volatility, the model significantly outperformed benchmarks, particularly in high-volatility scenarios, achieving higher return-to-risk ratios and demonstrating robust profitability. These results confirm the model's effectiveness in leveraging market dynamics and managing risks in volatile environments like the cryptocurrency market.
Portfolio Management, Machine Learning
### What problem does this paper attempt to solve?

This paper addresses portfolio management with Reinforcement Learning (RL) in high-risk environments and proposes improvements over traditional RL models. Specifically, the study contributes the following:

1. **Environment Modeling**: A new environment formulation is combined with a Profit and Loss (PnL)-based reward function to strengthen the RL agent's downside risk management.
2. **Market Mechanism**: The model uses two-sided transactions and lending, with USDT serving for both granting and receiving loans, to better exploit market opportunities.
3. **Algorithm Selection**: A Soft Actor-Critic (SAC) agent with a Convolutional Neural Network with Multi-Head Attention (CNN-MHA) manages a portfolio of 12 cryptocurrency assets.
4. **Empirical Validation**: The model is tested in the Binance perpetual futures market on data from two 16-month periods, one of high and one of low market volatility.

The study reports that the model significantly outperforms the benchmarks, particularly in the high-volatility period, achieving a higher return-to-risk ratio and showing that it can manage risk and allocate capital effectively in volatile markets.
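To make the idea of a PnL-based reward with two-sided positions and lending more concrete, here is a minimal Python sketch. It is not the authors' formulation: the function names `step_pnl` and `return_to_risk`, the `cash_rate` parameter, and the accounting rule (equity not committed to net asset exposure is held in USDT, so a positive balance earns interest and a negative balance pays it) are all assumptions made for illustration. Transaction costs, funding fees, and margin rules are ignored.

```python
from statistics import mean, pstdev


def step_pnl(weights, returns, cash_rate=0.0):
    """One-period portfolio PnL as a fraction of equity (illustrative only).

    weights:   signed asset weights; positive = long, negative = short.
    returns:   per-asset simple returns over the rebalancing period.
    cash_rate: hypothetical per-period USDT interest rate (an assumption).
    """
    if len(weights) != len(returns):
        raise ValueError("weights and returns must have the same length")
    # Assumed accounting: equity not tied up in net asset exposure is USDT.
    # cash > 0 means the portfolio lends; cash < 0 means it borrows.
    cash = 1.0 - sum(weights)
    asset_pnl = sum(w * r for w, r in zip(weights, returns))
    return asset_pnl + cash * cash_rate


def return_to_risk(pnls):
    """Mean per-period PnL divided by its population standard deviation."""
    sd = pstdev(pnls)
    return mean(pnls) / sd if sd > 0 else float("nan")


if __name__ == "__main__":
    # Long 50% of the first asset, short 30% of the second, rest lent out.
    reward = step_pnl([0.5, -0.3], [0.02, -0.01], cash_rate=0.001)
    print(f"one-step reward: {reward:.4f}")
```

In an RL environment, a value like the one returned by `step_pnl` could serve as the reward at each 4-hour rebalancing step, and `return_to_risk` could summarize an episode. Whether the paper defines its reward or its return-to-risk ratio this way is not stated in the abstract.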