Abstract:Investment portfolios, central to finance, balance potential returns and risks. This paper introduces a hybrid approach combining Markowitz's portfolio theory with reinforcement learning, utilizing knowledge distillation for training agents. In particular, our proposed method, called KDD (Knowledge Distillation DDPG), consist of two training stages: supervised and reinforcement learning stages. The trained agents optimize portfolio assembly. A comparative analysis against standard financial models and AI frameworks, using metrics like returns, the Sharpe ratio, and nine evaluation indices, reveals our model's superiority. It notably achieves the highest yield and Sharpe ratio of 2.03, ensuring top profitability with the lowest risk in comparable return scenarios.
What problem does this paper attempt to address?
The problem that this paper attempts to solve is: how to optimize the construction and management of financial portfolios by combining Markowitz Portfolio Theory and Reinforcement Learning (RL), especially the Deep Deterministic Policy Gradient (DDPG) algorithm. Specifically, the paper aims to develop a new hybrid method - Knowledge - Distilled Reinforcement Learning (KDD) - to improve the performance of portfolios in the actual market, achieving higher yields and lower risks.
### Core problems of the paper:
1. **Balancing returns and risks**: The core problem of financial portfolios is to find an optimal allocation that can both maximize expected returns and minimize risks. Although the traditional Markowitz model provides a theoretically optimal solution, it has limitations in practical applications, especially when dealing with complex and changeable market environments.
2. **Challenges in the application of reinforcement learning**: Although reinforcement learning has achieved remarkable success in many fields, when it is applied to financial portfolio management, it faces many challenges, such as continuous action spaces, large data noise, and differences between historical data and the real - market.
3. **Integrating traditional and modern technologies**: The paper attempts to combine the classic Markowitz portfolio theory with modern deep - learning and reinforcement - learning technologies. Through the method of knowledge distillation, the reinforcement - learning model can "learn" effective investment strategies from existing classic models, thereby enhancing its performance in the actual market.
### Specific objectives:
- **Develop the KDD model**: Through two - stage training (the supervised - learning stage and the reinforcement - learning stage), train an intelligent agent that can effectively optimize portfolios.
- **Supervised - learning stage**: Use the optimal portfolios generated by the Markowitz model as the teacher model, and transfer these strategies to the DDPG model through knowledge distillation.
- **Reinforcement - learning stage**: In the real - market environment, further optimize the investment - decision - making ability of the DDPG model through interaction with the environment.
- **Evaluate the model performance**: Through comparative experiments with traditional financial models and other AI frameworks, verify the superiority of the KDD model in multiple evaluation indicators (such as total return, Sharpe ratio, maximum drawdown, etc.).
### Innovation points of the paper:
- **Application of knowledge distillation**: Through knowledge distillation, the experience of the Markowitz model is incorporated into the DDPG model, enabling the reinforcement - learning model to have a certain investment - strategy foundation at the initial stage, thereby accelerating the learning process and improving the final performance.
- **Two - stage training method**: Combine the advantages of supervised learning and reinforcement learning. First, let the model quickly master basic investment strategies through supervised learning, and then continuously optimize and adapt to the complex market environment through reinforcement learning.
In summary, the goal of this paper is to develop a more efficient and intelligent portfolio - management method by combining traditional financial theory and modern artificial - intelligence technologies to deal with the complexity and uncertainty in the financial market.