Efficient Duple Perturbation Robustness in Low-rank MDPs

Yang Hu, Haitong Ma, Bo Dai, Na Li
2024-04-12
Abstract: The pursuit of robustness has recently been a popular topic in reinforcement learning (RL) research, yet the existing methods generally suffer from efficiency issues that obstruct their real-world implementation. In this paper, we introduce duple perturbation robustness, i.e., perturbation on both the feature and factor vectors for low-rank Markov decision processes (MDPs), via a novel characterization of $(\xi,\eta)$-ambiguity sets. The novel robust MDP formulation is compatible with the function representation view and is therefore naturally applicable to practical RL problems with large or even continuous state-action spaces. Meanwhile, it also gives rise to a provably efficient and practical algorithm with a theoretical convergence rate guarantee. Examples are designed to justify the new robustness concept, and algorithmic efficiency is supported by both theoretical bounds and numerical simulations.
Machine Learning, Optimization and Control
What problem does this paper attempt to address?
This paper addresses the performance gap between simulation and reality in reinforcement learning: how to design an algorithm that is computationally efficient and robust to duple perturbations, i.e., uncertainty in both the feature and factor vectors of a low-rank Markov decision process. Existing robust RL methods primarily target tabular MDPs and suffer from high computational and sample complexity. The paper proposes a new robustness concept based on (ξ, η)-rectangular ambiguity sets, which applies to low-rank MDPs, and designs an algorithm called R2PG. The algorithm comes with a theoretical convergence rate guarantee and is applicable to problems with large or continuous state-action spaces.
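To make the idea of perturbing both feature and factor vectors concrete, here is a minimal Python sketch. It is not the paper's definition of the ambiguity set or of R2PG. It assumes the standard low-rank form P(s' | s, a) = <phi(s, a), mu(s')>, and it assumes, purely for illustration, that each feature vector may shift by at most xi and each factor vector by at most eta in L2 norm. All function names are hypothetical, and the sketch makes no attempt to keep the perturbed kernel normalized.

```python
import math
import random


def random_simplex(dim, rng):
    """Sample a probability vector of length `dim`."""
    w = [rng.expovariate(1.0) for _ in range(dim)]
    total = sum(w)
    return [x / total for x in w]


def random_in_ball(dim, radius, rng):
    """Sample a vector whose L2 norm is at most `radius`."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    scale = radius * rng.random() / norm
    return [scale * x for x in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def make_low_rank_mdp(n_states, n_actions, rank, rng):
    """Build a toy nominal model (illustrative construction, not from the paper)."""
    # phi[(s, a)]: feature vector, here a distribution over `rank` latent factors.
    phi = {(s, a): random_simplex(rank, rng)
           for s in range(n_states) for a in range(n_actions)}
    # Each latent factor k carries its own next-state distribution.
    latent = [random_simplex(n_states, rng) for _ in range(rank)]
    # mu[s2]: factor vector of next state s2, one entry per latent factor.
    mu = {s2: [latent[k][s2] for k in range(rank)] for s2 in range(n_states)}
    return phi, mu


def transition(phi, mu, s, a, s2):
    """Low-rank structure: P(s2 | s, a) = <phi(s, a), mu(s2)>."""
    return dot(phi[(s, a)], mu[s2])


def perturb(phi, mu, xi, eta, rng):
    """Assumed perturbation model: L2 balls of radius xi (features) and eta (factors)."""
    phi_p = {key: [x + d for x, d in zip(vec, random_in_ball(len(vec), xi, rng))]
             for key, vec in phi.items()}
    mu_p = {key: [x + d for x, d in zip(vec, random_in_ball(len(vec), eta, rng))]
            for key, vec in mu.items()}
    return phi_p, mu_p


if __name__ == "__main__":
    rng = random.Random(0)
    phi, mu = make_low_rank_mdp(n_states=4, n_actions=2, rank=3, rng=rng)
    phi_p, mu_p = perturb(phi, mu, xi=0.05, eta=0.02, rng=rng)
    worst = max(abs(transition(phi_p, mu_p, s, a, s2) - transition(phi, mu, s, a, s2))
                for (s, a) in phi for s2 in mu)
    print("largest change in any transition probability:", worst)
```

Under this assumed perturbation model, the Cauchy-Schwarz inequality bounds the change in each transition probability by xi * ||mu(s')|| + eta * ||phi(s, a)|| + xi * eta, so the two radii jointly control how far the perturbed model can drift from the nominal one.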