Value function factorization with dynamic weighting for deep multi-agent reinforcement learning

Wei Du, Shifei Ding, Lili Guo, Jian Zhang, Chenglong Zhang, Ling Ding
DOI: https://doi.org/10.1016/j.ins.2022.10.042
IF: 8.1
2022-11-01
Information Sciences
Abstract: In many real-world scenarios, multiple agents must coordinate with each other because of their limited observation and communication capabilities. Deep multi-agent reinforcement learning has demonstrated significant success in such challenging settings by making use of value decomposition. One representative method is QMIX, which factorizes the global multi-agent Q-value into individual Q-values and constrains the joint-action Q-value to be monotonic in them through an implicit mixing method. However, this constraint prevents it from representing value functions in which the ordering of an agent’s actions depends on the actions of others. WQMIX presents two weighting schemes to tackle this restriction, but its weighting function is simple, which limits performance; a more appropriate weighting scheme is needed. To address this issue, we present a more complex and accurate weighting scheme, which we call Dynamic Weighting (DW), as opposed to the fixed weighting in WQMIX. Our proposed method, DW-QMIX, guarantees a more general decomposition than QMIX or WQMIX and places accurate importance on the better joint actions, thus leading to the optimal policy. Extensive experiments on simulation environments and real-life systems demonstrate that our proposed method outperforms existing multi-agent reinforcement learning methods.
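To make the two ideas the abstract contrasts concrete, the following is a minimal sketch, not the paper's implementation. It assumes a one-layer linear mixer with non-negative weights as a stand-in for QMIX-style monotonic mixing, and an optimistic fixed weighting in the style of WQMIX (weight 1 when the mixed value underestimates the target, a constant `alpha` otherwise). The function names and the `alpha` parameter are hypothetical. The abstract does not specify the Dynamic Weighting rule of DW-QMIX, so it is not reproduced here.

```python
def monotonic_mix(agent_qs, weights, bias):
    """Mix per-agent Q-values into a joint value.

    Taking abs() of each weight keeps the mix monotonic: raising any
    agent's Q-value can never lower the joint value. (Assumed linear
    stand-in for a QMIX-style mixing network.)
    """
    return sum(abs(w) * q for w, q in zip(weights, agent_qs)) + bias


def fixed_weight(q_tot, target, alpha=0.1):
    """Fixed weighting in the style of WQMIX's optimistic scheme (assumed form):
    full weight when the mixer underestimates the target, alpha otherwise."""
    return 1.0 if q_tot < target else alpha


def weighted_td_loss(q_tots, targets, alpha=0.1):
    """Mean of weighted squared errors over a batch of joint values."""
    total = 0.0
    for q, y in zip(q_tots, targets):
        total += fixed_weight(q, y, alpha) * (q - y) ** 2
    return total / len(q_tots)
```

In this sketch the weight takes only two values, which is the kind of simplicity the abstract points to; a dynamic scheme would replace `fixed_weight` with a rule that varies per joint action.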
computer science, information systems