Skill Enhancement Learning with Knowledge Distillation

Naijun Liu, Fuchun Sun, Bin Fang, Huaping Liu
DOI: https://doi.org/10.1007/s11432-023-4016-0
2024-01-01
Abstract: Skill learning through reinforcement learning has significantly progressed in recent years. However, it often struggles to efficiently find optimal or near-optimal policies due to the inherent trial-and-error exploration in reinforcement learning. Although algorithms have been proposed to enhance skill learning efficacy, there is still much room for improvement in terms of skill learning performance and training stability. In this paper, we propose an algorithm called skill enhancement learning with knowledge distillation (SELKD), which integrates multiple actors and multiple critics for skill learning. SELKD employs knowledge distillation to establish a mutual learning mechanism among actors. To mitigate critic overestimation bias, we introduce a novel target value calculation method. We also perform theoretical analysis to ensure the convergence of SELKD. Finally, experiments are conducted on several continuous control tasks, illustrating the effectiveness of the proposed algorithm.