Accelerating Robot Reinforcement Learning with Samples of Different Simulation Precision.

Yong Zhao, Yuanzhao Zhai, Jie Luo, Dawei Feng, Bo Ding, Zhen Li
DOI: https://doi.org/10.1109/hpcc-dss-smartcity-dependsys53884.2021.00080
2021-01-01
Abstract: Training an effective model via reinforcement learning in a simulated environment incurs considerable computational cost. To improve the efficiency of reinforcement learning, most existing efforts have focused on learning algorithms or sampling techniques, but little attention has been paid to making better use of the simulators themselves. In this paper, we propose a novel method that accelerates training by sampling from the simulator at different simulation precisions. Specifically, we propose a sequential training mode and a joint training mode. In the sequential training mode, a basic model is first learned from samples collected in a low-precision simulation environment and then fine-tuned in a high-precision simulation environment. In the joint training mode, training samples are drawn from the low-precision and high-precision simulation environments simultaneously. Extensive experiments in different robotic scenarios demonstrate that our method achieves almost the same performance as conventional methods at a much lower training cost, saving 35% to 63% of the training time without loss of performance.
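The two training modes described in the abstract can be illustrated as sampling schedules that decide, at each training step, which simulation precision to draw from. This is a minimal sketch and not the authors' implementation: the function names, the `low_fraction` parameter, and the random per-step mixing used in the joint mode are all assumptions made for illustration, since the abstract does not specify how samples are split between precisions.

```python
import random
from typing import Iterator


def sequential_schedule(total_steps: int, low_fraction: float) -> Iterator[str]:
    """Sequential mode (hypothetical sketch): low precision first, then high.

    `low_fraction` is an assumed knob for how much of training is spent
    in the low-precision environment before fine-tuning in high precision.
    """
    switch_step = int(total_steps * low_fraction)
    for step in range(total_steps):
        yield "low" if step < switch_step else "high"


def joint_schedule(
    total_steps: int, low_fraction: float, seed: int = 0
) -> Iterator[str]:
    """Joint mode (hypothetical sketch): mix both precisions throughout.

    Each step draws from the low-precision environment with probability
    `low_fraction`; this random mixing rule is an assumption.
    """
    rng = random.Random(seed)
    for _ in range(total_steps):
        yield "low" if rng.random() < low_fraction else "high"


if __name__ == "__main__":
    print(list(sequential_schedule(10, 0.5)))
    print(list(joint_schedule(10, 0.5)))
```

In a real training loop, each yielded label would select the environment instance used to collect the next batch of samples; the learner itself would be unchanged.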