Abstract: We explored the use of Reinforcement Learning (RL) combined with risk assessment for optimizing investment portfolios. The dynamic nature of trading, compounded by market frictions, the responses of other market participants, and uncertainty, poses challenges to portfolio optimization. The intricacies of the financial market make it difficult to model accurately, and regulatory requirements and internal risk policies additionally mandate risk-averse decisions to avoid catastrophic outcomes. To address this, we proposed risk estimation relative to the investor's risk tolerance threshold. Moreover, although modern Deep Learning models are adept at approximating complex relationships in abundant data, their main drawback today is poor generalization of those relationships to unseen data. The resulting epistemic uncertainty can therefore pose a risk to the decision-making system. We addressed this uncertainty by using a Variational Autoencoder (VAE) to estimate it and a Cost Network to backpropagate riskiness through the model, so that the agent learns actions with safe results. Actions with stable results or lower reward are avoided owing to the reward optimization of RL. As a result, we reduced risk and uncertainty in the agent testing process. Our risk-constrained RL algorithm showed zero violations of the constraint in the testing phase. This suggests that a risk-averse RL approach could be beneficial for portfolio optimization, particularly for risk-averse investors.

Risk-Averse Trust Region Optimization for Reward-Volatility Reduction

An Off-Policy Trust Region Policy Optimization Method with Monotonic Improvement Guarantee for Deep Reinforcement Learning

Option Hedging with Risk Averse Reinforcement Learning

Policy Optimization with Stochastic Mirror Descent

Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning

Policy Gradients with Variance Related Risk Criteria

Conditional Value-at-Risk for Quantitative Trading: A Direct Reinforcement Learning Approach

Efficient Risk-Averse Reinforcement Learning

Efficient Off-Policy Safe Reinforcement Learning Using Trust Region Conditional Value at Risk

A Stochastic Trust-Region Framework for Policy Optimization

Matrix Low-Rank Trust Region Policy Optimization

Uncertainty-Aware Reinforcement Learning for Portfolio Optimization

Reinforcement Learning for Credit Index Option Hedging

Uncertainty-Aware Policy Optimization: A Robust, Adaptive Trust Region Approach

CVA Hedging by Risk-Averse Stochastic-Horizon Reinforcement Learning

A policy gradient approach for optimization of smooth risk measures

Average-Reward Reinforcement Learning with Trust Region Methods

Embedding Safety into RL: A New Take on Trust Region Methods

Extreme Risk Mitigation in Reinforcement Learning using Extreme Value Theory

Policy Gradient Methods for Distortion Risk Measures

Trust-Region Stochastic Optimization with Variance Reduction Technique