Mutual Information as Intrinsic Reward of Reinforcement Learning Agents for On-demand Ride Pooling

Xianjie Zhang, Jiahao Sun, Chen Gong, Kai Wang, Yifei Cao, Hao Chen, Hao Chen, Yu Liu
2024-01-07
Abstract: The emergence of on-demand ride pooling services allows each vehicle to serve multiple passengers at a time, thus increasing drivers' income and enabling passengers to travel at lower prices than taxi/car on-demand services (only one passenger can be assigned to a car at a time like UberX and Lyft). Although on-demand ride pooling services can bring so many benefits, ride pooling services need a well-defined matching strategy to maximize the benefits for all parties (passengers, drivers, aggregation companies and environment), in which the regional dispatching of vehicles has a significant impact on the matching and revenue. Existing algorithms often only consider revenue maximization, which makes it difficult for requests with unusual distribution to get a ride. How to increase revenue while ensuring a reasonable assignment of requests brings a challenge to ride pooling service companies (aggregation companies). In this paper, we propose a framework for vehicle dispatching for ride pooling tasks, which splits the city into discrete dispatching regions and uses the reinforcement learning (RL) algorithm to dispatch vehicles in these regions. We also consider the mutual information (MI) between vehicle and order distribution as the intrinsic reward of the RL algorithm to improve the correlation between their distributions, thus ensuring the possibility of getting a ride for unusually distributed requests. In experimental results on a real-world taxi dataset, we demonstrate that our framework can significantly increase revenue up to an average of 3% over the existing best on-demand ride pooling method.
Artificial Intelligence, Machine Learning, Systems and Control
What problem does this paper attempt to address?
### Problems the paper attempts to solve

The paper addresses the vehicle dispatching and matching problems in on-demand ride pooling services. Specifically, it focuses on the following challenges:

1. **Supply-demand imbalance**: Travel demand is unevenly distributed across the city, so vehicles are in short supply in some areas and in excess in others. The ride pooling system therefore has to dispatch vehicles so that as many unusually distributed requests as possible are served, while reducing the pick-up distance and time of vehicles.
2. **Dependency between dispatching and matching**: Dispatching decisions determine which order areas a vehicle can choose from during matching, and the expected revenue of the matching results is fed back into the dispatching decisions. The two steps therefore depend on each other.
3. **Multi-request combined matching problem**: Unlike an ordinary single-passenger ride service, an on-demand ride pooling system has to combine multiple passengers on the same route into one "trip" and match them to the same vehicle. This turns the bipartite graph matching problem between passengers and vehicles into a tripartite graph matching problem among requests, trips and vehicles, which increases the complexity of the problem.

To address these challenges, the paper proposes a vehicle dispatching framework based on reinforcement learning (RL). The framework divides the city into discrete dispatching regions and uses the mutual information (MI) between the vehicle and order distributions as an intrinsic reward, which strengthens the correlation between the two distributions and thereby improves the overall revenue of the system. Experimental results on a real-world taxi dataset show that the framework outperforms the existing best on-demand ride pooling method, with an average revenue increase of up to 3%.
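
The idea of adding MI as an intrinsic reward can be illustrated with a minimal sketch. It is not the paper's implementation: this summary does not say how the joint distribution of vehicles and orders is estimated, so the sketch assumes a hypothetical co-occurrence table `joint_counts`, where entry `[i][j]` counts vehicle-order pairings with the vehicle in region `i` and the order in region `j`. The function names and the weight `beta` are likewise assumptions made for illustration.

```python
import math


def mutual_information(joint_counts):
    """Mutual information (in nats) of a 2-D table of non-negative counts.

    Rows index the vehicle region and columns the order region; this
    layout is an assumption of the sketch, not taken from the paper.
    """
    total = sum(sum(row) for row in joint_counts)
    if total == 0:
        return 0.0
    row_sums = [sum(row) for row in joint_counts]
    col_sums = [sum(col) for col in zip(*joint_counts)]
    mi = 0.0
    for i, row in enumerate(joint_counts):
        for j, count in enumerate(row):
            if count == 0:
                continue  # the term 0 * log(0) is taken as 0
            p_xy = count / total
            p_x = row_sums[i] / total
            p_y = col_sums[j] / total
            mi += p_xy * math.log(p_xy / (p_x * p_y))
    return mi


def shaped_reward(revenue, joint_counts, beta=0.1):
    """Extrinsic revenue plus a weighted MI bonus (beta is hypothetical)."""
    return revenue + beta * mutual_information(joint_counts)


if __name__ == "__main__":
    independent = [[1, 1], [1, 1]]  # vehicles placed regardless of orders
    aligned = [[2, 0], [0, 2]]      # vehicles sit where the orders are
    print(mutual_information(independent))
    print(mutual_information(aligned))
    print(shaped_reward(10.0, aligned, beta=1.0))
```

In this toy setting the MI term is zero when vehicle placement is independent of where orders appear and grows as the two distributions align, so an agent maximizing `shaped_reward` is nudged toward regions with otherwise underserved demand in addition to maximizing revenue.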