A Priori Estimates for Deep Residual Network in Continuous-time Reinforcement Learning

Shuyu Yin, Qixuan Zhou, Fei Wen, Tao Luo
DOI: https://doi.org/10.48550/arxiv.2402.16899
2024-01-01
Abstract: Deep reinforcement learning excels in numerous large-scale practical applications. However, existing performance analyses ignore the unique characteristics of continuous-time control problems, are unable to directly estimate the generalization error of the Bellman optimal loss, and require a boundedness assumption. Our work focuses on continuous-time control problems and proposes a method that is applicable to all such problems where the transition function satisfies semi-group and Lipschitz properties. Under this method, we can directly analyze the a priori generalization error of the Bellman optimal loss. The core of this method lies in two transformations of the loss function. To complete the transformation, we propose a decomposition method for the maximum operator. Additionally, this analysis method does not require a boundedness assumption. Finally, we obtain an a priori generalization error without the curse of dimensionality.