On Efficient Sampling in Offline Reinforcement Learning

Qing-Shan Jia
2024-01-01
Abstract: Offline reinforcement learning has attracted growing attention due to advances in simulation technology and human-machine interaction. A critical challenge is how to sample the state and action space efficiently when initiating each simulation. Effective sampling could significantly reduce both the offline learning time and its cost. We consider this problem and make the following contributions. First, we convert the sampling problem into maximizing the probability of correctly selecting the best policy (PCS) under a given computing budget. Second, we develop an algorithm that asymptotically maximizes this PCS and prove this property mathematically. We hope this work will bring the reinforcement learning and simulation-based optimization communities closer together.
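
To make the PCS objective concrete, the following is a minimal sketch of how PCS could be estimated by Monte Carlo for a fixed split of the computing budget across candidate policies. It is not the algorithm developed in the paper, which the abstract does not specify. The function name `estimate_pcs`, its parameters, and the assumption that each policy's simulated return is Gaussian with known mean and standard deviation are all illustrative assumptions.

```python
import random


def estimate_pcs(true_means, stds, allocation, reps=2000, seed=0):
    """Monte Carlo estimate of the probability of correct selection (PCS).

    Hypothetical helper: assumes Gaussian simulation noise and that
    allocation[i] >= 1 simulation runs are spent on policy i.
    """
    rng = random.Random(seed)
    # Index of the truly best policy (largest true mean return).
    best = max(range(len(true_means)), key=lambda i: true_means[i])
    correct = 0
    for _ in range(reps):
        # The sample mean of n i.i.d. N(mu, sd^2) draws is N(mu, sd^2 / n),
        # so it is drawn directly instead of simulating n runs.
        sample_means = [
            rng.gauss(mu, sd / n ** 0.5)
            for mu, sd, n in zip(true_means, stds, allocation)
        ]
        picked = max(range(len(sample_means)), key=lambda i: sample_means[i])
        correct += picked == best
    return correct / reps


if __name__ == "__main__":
    means = [0.0, 0.5, 0.6]   # assumed true returns of three policies
    stds = [1.0, 1.0, 1.0]    # assumed noise levels
    # Same total budget of 90 runs, split two different ways.
    print("equal split :", estimate_pcs(means, stds, [30, 30, 30]))
    print("skewed split:", estimate_pcs(means, stds, [10, 40, 40]))
```

Comparing the two printed estimates shows how the same total budget can yield different PCS values depending on how it is divided, which is the quantity the sampling problem in the abstract seeks to maximize.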