Abstract: The ride-hailing service offered by mobility-on-demand platforms, such as Uber and Didi Chuxing, has greatly facilitated travel and commuting and has become increasingly popular in recent years. Efficiency (e.g., gross merchandise volume) has always been an important metric for such platforms. However, focusing only on efficiency inevitably ignores the fairness of driver incomes, which could impair the sustainability of the overall ride-hailing system in the long run. Order dispatching and driver repositioning play an important role in optimizing these two metrics, as they affect not only the immediate but also the future order-serving outcomes of drivers. Thus, in this paper, we aim to exploit joint order dispatching and driver repositioning to optimize both the long-term efficiency and fairness of ride-hailing platforms. To address this problem, we propose a novel multi-agent reinforcement learning framework, referred to as JDRL, to help drivers make distributed order selection and repositioning decisions. Specifically, to cope with the variable action space, JDRL segments the action space into a fixed number of action groups, and fixes the policy output dimension for order selection as the number of action groups. In terms of the fairness criterion, JDRL adopts max-min fairness, and augments the vanilla policy gradient into an iterative training algorithm that alternates between a minimization step and a policy improvement step to maximize both the worst and the overall performance of agents. In addition, we provide a theoretical convergence guarantee for our JDRL training algorithm, even under non-convex policy networks and stochastic gradient updates. Extensive experiments are conducted with three public real-world ride-hailing order datasets, including over 2 million orders in Haikou, China, over 5 million orders in Chengdu, China, and over 6 million orders in New York City, USA. Experimental results show that JDRL demonstrates a consistent advantage over state-of-the-art baselines in terms of both efficiency and fairness. To the best of our knowledge, this is the first work that exploits joint order dispatching and driver repositioning to optimize both the long-term efficiency and fairness in a ride-hailing system.
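
The idea of mapping a variable number of candidate orders onto a fixed-size policy output can be illustrated with a minimal sketch. The abstract does not say how JDRL forms its action groups, so everything below is an assumption for illustration only: the grouping rule (bucketing orders by fare), the number of groups, and the helper names `group_orders`, `masked_softmax`, and `select_order` are all hypothetical and are not taken from the paper.

```python
import math
import random

# Hypothetical fare thresholds; 3 edges give 4 action groups, so the
# policy output dimension is fixed at 4 regardless of how many orders exist.
BIN_EDGES = [10.0, 20.0, 30.0]


def group_orders(order_fares, bin_edges):
    """Assign each candidate order index to one of len(bin_edges) + 1 groups.

    Grouping by fare is an assumed rule, used here only to show the mechanism.
    """
    groups = [[] for _ in range(len(bin_edges) + 1)]
    for idx, fare in enumerate(order_fares):
        group_id = sum(fare >= edge for edge in bin_edges)
        groups[group_id].append(idx)
    return groups


def masked_softmax(logits, mask):
    """Softmax over unmasked entries; masked entries get probability 0."""
    valid = [l for l, m in zip(logits, mask) if m]
    if not valid:
        raise ValueError("no non-empty action group")
    top = max(valid)  # subtract the max for numerical stability
    exps = [math.exp(l - top) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def select_order(logits, order_fares, bin_edges, rng):
    """Sample an action group from fixed-size logits, then an order inside it."""
    groups = group_orders(order_fares, bin_edges)
    mask = [len(g) > 0 for g in groups]  # empty groups cannot be chosen
    probs = masked_softmax(logits, mask)
    group_id = rng.choices(range(len(groups)), weights=probs, k=1)[0]
    # Uniform choice within the group is another assumption of this sketch.
    return rng.choice(groups[group_id]), probs


if __name__ == "__main__":
    rng = random.Random(0)
    fares = [5.0, 12.0, 35.0, 8.0]   # four candidate orders this step
    logits = [0.0, 0.0, 0.0, 0.0]    # stand-in for a policy network output
    order_idx, probs = select_order(logits, fares, BIN_EDGES, rng)
    print(order_idx, probs)
```

In this sketch the logits always have one entry per group, so the same policy head can be used whether a driver sees two candidate orders or two hundred; only the mask and the within-group choice depend on the current candidates.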

Optimizing Long-Term Efficiency and Fairness in Ride-Hailing via Joint Order Dispatching and Driver Repositioning
