Abstract: The ride-hailing service offered by mobility-on-demand platforms, such as Uber and Didi Chuxing, has greatly facilitated travel and commuting and has become increasingly popular in recent years. Efficiency (e.g., gross merchandise volume) has always been an important metric for such platforms. However, focusing only on efficiency inevitably ignores the fairness of driver incomes, which could impair the sustainability of the overall ride-hailing system in the long run. Order dispatching and driver repositioning play an important role in optimizing these two essential metrics, as they affect not only the immediate but also the future order-serving outcomes of drivers. Thus, in this paper, we aim to exploit joint order dispatching and driver repositioning to optimize both the long-term efficiency and fairness of ride-hailing platforms. To address this problem, we propose a novel multi-agent reinforcement learning framework, referred to as JDRL, to help drivers make distributed order selection and repositioning decisions. Specifically, to cope with the variable action space, JDRL segments the action space into a fixed number of action groups and fixes the policy output dimension for order selection to the number of action groups. For the fairness criterion, JDRL adopts max-min fairness and augments the vanilla policy gradient into an iterative training algorithm that alternates between a minimization step and a policy improvement step to maximize both the worst and the overall performance of agents. In addition, we provide a theoretical convergence guarantee for the JDRL training algorithm, even under non-convex policy networks and stochastic gradient updates. Extensive experiments are conducted on three public real-world ride-hailing order datasets, comprising over 2 million orders in Haikou, China, over 5 million orders in Chengdu, China, and over 6 million orders in New York City, USA.
Experimental results show that JDRL consistently outperforms state-of-the-art baselines in terms of both efficiency and fairness. To the best of our knowledge, this is the first work that exploits joint order dispatching and driver repositioning to optimize both the long-term efficiency and fairness in a ride-hailing system.
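
To illustrate the idea of segmenting a variable action space into a fixed number of action groups, the following is a minimal sketch and not the JDRL implementation. The abstract does not say how orders are grouped or how an order is picked within a group, so the sketch assumes grouping by price bucket, masking of empty groups, and a uniform random pick inside the chosen group. All names (`group_orders`, `masked_softmax`, `select_order`, the `price` field, `max_price`) are hypothetical.

```python
import math
import random


def group_orders(orders, num_groups, max_price):
    """Bucket a variable-length list of candidate orders into a fixed number
    of groups. Grouping by price is an assumption made for illustration."""
    groups = [[] for _ in range(num_groups)]
    width = max_price / num_groups
    for order in orders:
        # Clamp so that prices at or above max_price fall into the last group.
        idx = min(int(order["price"] // width), num_groups - 1)
        groups[idx].append(order)
    return groups


def masked_softmax(logits, mask):
    """Softmax over non-empty groups only; empty groups get probability 0."""
    valid = [l for l, m in zip(logits, mask) if m]
    if not valid:
        raise ValueError("no non-empty action group")
    peak = max(valid)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def select_order(orders, logits, max_price, rng):
    """Sample a group from a fixed-size policy output, then pick one order
    uniformly at random from that group (an assumed within-group rule)."""
    groups = group_orders(orders, len(logits), max_price)
    probs = masked_softmax(logits, [bool(g) for g in groups])
    chosen = rng.choices(range(len(groups)), weights=probs, k=1)[0]
    return rng.choice(groups[chosen])


if __name__ == "__main__":
    candidate_orders = [
        {"id": 1, "price": 5.0},
        {"id": 2, "price": 25.0},
        {"id": 3, "price": 27.0},
    ]
    # The policy output has one logit per group, however many orders exist.
    policy_logits = [0.0, 0.0, 0.0]
    print(select_order(candidate_orders, policy_logits, 30.0, random.Random(0)))
```

In this sketch the length of `policy_logits` stays constant as the number of candidate orders changes, which is the property the fixed policy output dimension is meant to provide.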

A Reinforcement Learning and Prediction-Based Lookahead Policy for Vehicle Repositioning in Online Ride-Hailing Systems

Spatial-temporal Pricing for Ride-Sourcing Platform with Reinforcement Learning

i-Rebalance: Personalized Vehicle Repositioning for Supply Demand Balance

Reinforcement Learning from Optimization Proxy for Ride-Hailing Vehicle Relocation

Spatio-temporal Incentives Optimization for Ride-hailing Services with Offline Deep Reinforcement Learning

Predictive Vehicle Repositioning for On-Demand Ride-Pooling Services

DROP: Deep relocating option policy for optimal ride-hailing vehicle repositioning

A prediction-based iterative Kuhn-Munkres approach for service vehicle reallocation in ride-hailing

Optimizing Long-Term Efficiency and Fairness in Ride-Hailing via Joint Order Dispatching and Driver Repositioning

Multi-Agent Reinforcement Learning for Order-dispatching via Order-Vehicle Distribution Matching

HMDRL: Hierarchical Mixed Deep Reinforcement Learning to Balance Vehicle Supply and Demand

Dynamic Balancing-Charging Management for Shared Autonomous Electric Vehicle Systems: A Two-Stage Learning-Based Approach

Where to go: Agent Guidance with Deep Reinforcement Learning in A City-Scale Online Ride-Hailing Service

Optimizing Online Matching for Ride-Sourcing Services with Multi-Agent Deep Reinforcement Learning

Supply-Demand-aware Deep Reinforcement Learning for Dynamic Fleet Management

Towards More Efficient Shared Autonomous Mobility: A Learning-Based Fleet Repositioning Approach

A Reinforcement Learning Approach for Dynamic Rebalancing in Bike-Sharing System

Optimising Stochastic Routing for Taxi Fleets with Model Enhanced Reinforcement Learning

Dual Policy Reinforcement Learning for Real-time Rebalancing in Bike-sharing Systems

Rethinking Order Dispatching in Online Ride-Hailing Platforms

A prediction-based forward-looking vehicle dispatching strategy for dynamic ride-pooling