Markov Decision Processes under Risk Sensitivity: A Discount Vanishing Approach

Tanhao Huang, Jinwen Chen
DOI: https://doi.org/10.1016/j.jmaa.2023.128026
IF: 1.417
2024-01-01
Journal of Mathematical Analysis and Applications
Abstract: This paper considers Markov decision processes (MDPs) with risk-sensitivity. The aim is to explore the effects of state transience and non-communication on the optimal control of the system. A vanishing discount approach is investigated. The approximating system is the MDP evaluated by the usual discounted exponential utility. It is shown that the optimal discounted value functions, after being appropriately normalized, converge to the optimal risk-sensitive averages as the discount factor goes to 1. These value functions are shown to depend on a certain order structure of the state space. In this way it is also proved that the optimal policies for the discounted system converge to the optimal ones for the risk-sensitive system as the discount factor goes to 1. To prove these results, an ordered partition of the state space is introduced, which is closely related to the characteristics of state communication, transience and absorption. (c) 2023 Elsevier Inc. All rights reserved.
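As a rough illustration of why state transience matters under risk sensitivity, the following minimal sketch evaluates the risk-sensitive average cost (1/(lam*n)) * log E_x[exp(lam * sum of costs over n steps)] for a fixed, uncontrolled chain by backward recursion. It is not the paper's vanishing discount construction. The function name, the two-state chain, the costs and the risk parameters are all hypothetical choices made for this example.

```python
import math

def risk_sensitive_average(P, cost, lam, horizon):
    """Approximate (1/(lam*n)) * log E_x[exp(lam * sum_{t<n} cost[X_t])]
    for every start state x of a fixed (uncontrolled) Markov chain.

    P, cost, lam and horizon are illustrative inputs, not taken from the paper.
    """
    n_states = len(P)
    # log_u[x] = log E_x[exp(lam * cost accumulated over the stages processed so far)]
    log_u = [0.0] * n_states
    for _ in range(horizon):
        new_log_u = []
        for x in range(n_states):
            # Log-sum-exp over successor states, for numerical stability.
            terms = [math.log(P[x][y]) + log_u[y]
                     for y in range(n_states) if P[x][y] > 0]
            m = max(terms)
            lse = m + math.log(sum(math.exp(t - m) for t in terms))
            new_log_u.append(lam * cost[x] + lse)
        log_u = new_log_u
    return [v / (lam * horizon) for v in log_u]

# Hypothetical chain: state 0 is transient (cost 1), state 1 is absorbing (cost 0).
P = [[0.5, 0.5],
     [0.0, 1.0]]
cost = [1.0, 0.0]

high_risk = risk_sensitive_average(P, cost, lam=2.0, horizon=2000)
low_risk = risk_sensitive_average(P, cost, lam=0.5, horizon=2000)
print(high_risk)  # state 0 is roughly 0.65, state 1 is 0.0
print(low_risk)   # both states are close to 0
```

In this toy chain the risk-neutral long-run average cost is 0 from either state, because the chain is absorbed in state 1. With lam = 2 the quantity 0.5 * exp(2) exceeds 1, so the cost paid while lingering in the transient state dominates the exponential moment and the risk-sensitive average from state 0 stays strictly positive. With lam = 0.5 that quantity is below 1 and the effect disappears.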