Adaptive Learning Rates for Multi-Agent Reinforcement Learning

Jiechuan Jiang, Zongqing Lu
2023-01-01
Abstract: In multi-agent reinforcement learning (MARL), the learning rates of the actors and the critic are mostly hand-tuned and fixed. This not only requires heavy tuning but, more importantly, limits learning. Some optimizers proposed for general optimization adapt learning rates to gradient patterns, but they do not take the characteristics of MARL into consideration. In this paper, we propose AdaMa to bring adaptive learning rates to cooperative MARL. AdaMa evaluates the contribution of actors' updates to the improvement of the Q-value and adaptively updates the actors' learning rates in the direction that maximally improves the Q-value. AdaMa can also dynamically balance the learning rates between the critic and the actors according to their varying effects on learning. Moreover, AdaMa can incorporate a second-order approximation to capture the contribution of pairwise actors' updates and thus update the actors' learning rates more accurately. Empirically, we show that AdaMa can accelerate learning and improve performance in a variety of multi-agent scenarios. More importantly, AdaMa does not require heavy hyperparameter tuning and thus significantly reduces the training cost. Visualizations of the learning rates during training clearly explain how and why AdaMa works.
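The abstract's core idea, steering the actors' learning rates toward whatever most improves the Q-value, can be illustrated with a small first-order sketch. This is not the authors' implementation. It assumes that each actor's contribution is estimated as the inner product of the critic's gradient with respect to that actor's action and the action change its update would cause, and that the learning-rate vector is kept at a fixed L2 norm. The function names and the `step`, `budget`, and `floor` parameters are hypothetical and chosen only for illustration.

```python
import math


def first_order_contributions(q_grads_wrt_actions, action_deltas):
    """Estimate each actor's contribution to the change in Q (assumed form).

    q_grads_wrt_actions[i] is dQ/da_i and action_deltas[i] is the change in
    actor i's action caused by its update at unit learning rate. The
    contribution is the first-order Taylor term <dQ/da_i, delta a_i>.
    """
    return [
        sum(g * d for g, d in zip(grad, delta))
        for grad, delta in zip(q_grads_wrt_actions, action_deltas)
    ]


def adapt_learning_rates(rates, contributions, step=0.1, budget=None, floor=1e-6):
    """Move the learning-rate vector toward the contribution direction.

    The L2 norm of the rates is held at `budget` (default: the current norm),
    so rates are redistributed among actors rather than grown without bound.
    """
    if budget is None:
        budget = math.sqrt(sum(r * r for r in rates))
    norm_c = math.sqrt(sum(c * c for c in contributions))
    if norm_c == 0.0:
        # No signal about which actor helps more: leave the rates unchanged.
        return list(rates)
    # Rates that would maximize the first-order Q improvement at this budget.
    target = [budget * c / norm_c for c in contributions]
    # Interpolate toward the target and keep every rate strictly positive.
    mixed = [max(floor, (1.0 - step) * r + step * t) for r, t in zip(rates, target)]
    norm_m = math.sqrt(sum(m * m for m in mixed))
    return [budget * m / norm_m for m in mixed]


# Two actors with equal starting rates; actor 0's update helps Q more.
contribs = first_order_contributions(
    [[1.0, 0.0], [0.0, 0.5]],   # dQ/da_i for each actor
    [[0.2, 0.0], [0.0, 0.1]],   # action change per unit learning rate
)
new_rates = adapt_learning_rates([0.01, 0.01], contribs)
```

In this toy setup the actor whose update contributes more to the Q-value ends up with the larger learning rate, while the overall magnitude of the rate vector stays the same. Balancing the critic against the actors and the second-order pairwise term mentioned in the abstract are not covered by this sketch.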