Information-Theoretic Generalization Bounds for Batch Reinforcement Learning

Xingtu Liu
DOI: https://doi.org/10.3390/e26110995
IF: 2.738
2024-11-27
Entropy
Abstract: We analyze the generalization properties of batch reinforcement learning (batch RL) with value function approximation from an information-theoretic perspective. We derive generalization bounds for batch RL using (conditional) mutual information. In addition, we demonstrate how to establish a connection between certain structural assumptions on the value function space and conditional mutual information. As a by-product, we derive a high-probability generalization bound via conditional mutual information, which was left open and may be of independent interest.
physics, multidisciplinary