Observation-Based Performance Sensitivity Analysis for POMDPs

Zhe Ji, Xiaofeng Jiang, Hongsheng Xi
DOI: https://doi.org/10.1109/chicc.2015.7259887
2015-01-01
Abstract: In this paper, performance sensitivity analysis for Markov decision processes (MDPs) is generalized to partially observable Markov decision processes (POMDPs). Observation-based versions of the performance derivative formula and the performance difference formula are derived. The derivation does not require any overly strict assumptions. To find the optimal observation-based policy, an observation-based policy iteration algorithm is designed. Finally, an example is presented to show the applicability of the algorithm.