Effective credit assignment deep policy gradient multi-agent reinforcement learning for vehicle dispatch

Xiaohui Huang, Xiong Zhang, Jiahao Ling, Xuebo Cheng
DOI: https://doi.org/10.1007/s10489-023-04689-z
IF: 5.3
2023-07-11
Applied Intelligence
Abstract: Online car-hailing platforms have given people more travel options and greater convenience. However, the ‘tidal phenomenon’ of travel often leads to an imbalance between vehicle supply and demand, especially during peak hours. In this paper, we propose a reinforcement learning algorithm for fleet dispatch based on effective Credit Assignment Deep Policy Gradient (CADPG). First, the CADPG model learns an action for each agent (i.e., vehicle) from the vehicle's local state through the policy network. Second, a hyper-network takes the global state as input and learns a set of credit-assignment parameters used to compute the total Q value. Finally, the joint action vectors and the hyperparameters produced by the hyper-network are fed into the critic network to obtain the total Q value of the joint actions. Experiments on real datasets show that the proposed method outperforms the compared algorithms.
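The three steps in the abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the abstract gives no layer sizes, activations, or training details, so the single linear layers, the softmax action vectors, the shared policy weights, and the linear critic below are all assumptions, and every name (`cadpg_forward`, `init_params`, `pi_w`, `hyper_w`) is hypothetical.

```python
import math
import random


def linear(weights, bias, x):
    """Return weights @ x + bias using plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def init_params(rng, local_dim, global_dim, n_agents, n_actions):
    """Hypothetical parameter layout; sizes are illustrative only."""
    joint_dim = n_agents * n_actions
    return {
        # Policy network (assumed shared by all vehicles).
        "pi_w": rand_matrix(rng, n_actions, local_dim),
        "pi_b": [0.0] * n_actions,
        # Hyper-network: global state -> one weight per joint-action entry + bias.
        "hyper_w": rand_matrix(rng, joint_dim + 1, global_dim),
        "hyper_b": [0.0] * (joint_dim + 1),
    }


def cadpg_forward(local_states, global_state, params):
    # Step 1: the policy network maps each vehicle's local state to an action vector.
    actions = [softmax(linear(params["pi_w"], params["pi_b"], s))
               for s in local_states]
    joint_action = [a for vec in actions for a in vec]

    # Step 2: the hyper-network maps the global state to credit-assignment parameters.
    hyper_out = linear(params["hyper_w"], params["hyper_b"], global_state)
    credit_w, credit_b = hyper_out[:-1], hyper_out[-1]

    # Step 3: the critic (assumed linear here) combines the joint action
    # with the generated parameters to produce the total Q value.
    q_total = sum(w * a for w, a in zip(credit_w, joint_action)) + credit_b
    return actions, q_total


rng = random.Random(0)
n_agents, n_actions, local_dim, global_dim = 3, 4, 5, 6
params = init_params(rng, local_dim, global_dim, n_agents, n_actions)
local_states = [[rng.random() for _ in range(local_dim)] for _ in range(n_agents)]
global_state = [rng.random() for _ in range(global_dim)]
actions, q_total = cadpg_forward(local_states, global_state, params)
```

Because the critic's weights are generated from the global state rather than fixed, the contribution of each vehicle's action to the total Q value can change with overall supply and demand, which is the credit-assignment idea the abstract describes.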
computer science, artificial intelligence
What problem does this paper attempt to address?