Markov Decision Processes with Noisy State Observation

Amirhossein Afsharrad,Sanjay Lall
2023-12-14
Abstract:This paper addresses the challenge of a particular class of noisy state observations in Markov Decision Processes (MDPs), a common issue in various real-world applications. We focus on modeling this uncertainty through a confusion matrix that captures the probabilities of misidentifying the true state. Our primary goal is to estimate the inherent measurement noise, and to this end, we propose two novel algorithmic approaches. The first, the method of second-order repetitive actions, is designed for efficient noise estimation within a finite time window, providing identifiable conditions for system analysis. The second approach comprises a family of Bayesian algorithms, which we thoroughly analyze and compare in terms of performance and limitations. We substantiate our theoretical findings with simulations, demonstrating the effectiveness of our methods in different scenarios, particularly highlighting their behavior in environments with varying stationary distributions. Our work advances the understanding of reinforcement learning in noisy environments, offering robust techniques for more accurate state estimation in MDPs.
Machine Learning,Systems and Control
What problem does this paper attempt to address?
This paper attempts to solve the problem of noisy state observations in Markov Decision Processes (MDPs). Specifically, the author focuses on how to model this uncertainty through the confusion matrix, which captures the probability of misidentifying the true state. The main goal of the paper is to estimate the inherent measurement noise and proposes two new algorithmic approaches for this purpose: 1. **Method of Second - Order Repetitive Actions**: - This method aims to efficiently estimate noise within a limited time window and provides identifiability conditions for system analysis. - By repeatedly selecting a certain action and recording the observed state transitions over a period of time, the state distribution can be stabilized, making noise estimation feasible. 2. **Family of Bayesian Algorithms**: - These algorithms perform noise estimation by maintaining and updating the probability distributions of unknown parameters and the current state. - The paper conducts a detailed analysis and comparison of these Bayesian algorithms and explores their performance and limitations. Through these methods, the paper aims to improve the robustness and accuracy of reinforcement learning in noisy environments, especially in different scenarios, particularly in environments with different stationary distributions. The author verifies the effectiveness of these methods through simulation experiments and shows their performance in different situations.