Abstract: Owing to the benefits for customers (lower prices), drivers (higher revenues), aggregation companies (higher revenues) and the environment (fewer vehicles), on-demand ride pooling (e.g., Uber pool, Grab Share) has become quite popular. The significant computational complexity of matching vehicles to combinations of requests has meant that traditional ride pooling approaches are myopic: they do not consider the impact of current matches on future value for vehicles/drivers. Recently, Neural Approximate Dynamic Programming (NeurADP) has employed value decomposition with Approximate Dynamic Programming (ADP) to outperform leading approaches by considering the impact of an individual agent's (vehicle's) chosen actions on the future value of that agent. However, to ensure scalability and facilitate city-scale ride pooling, NeurADP completely ignores the impact of other agents' actions on individual agent/vehicle value. As demonstrated in our experimental results, ignoring the impact of other agents' actions on individual value can have a significant impact on overall performance when there is increased competition among vehicles for demand. Our key contribution is a novel mechanism, based on computing conditional expectations through joint conditional probabilities, for capturing dependencies on other agents' actions without increasing the complexity of training or decision making. We show that our new approach, Conditional Expectation based Value Decomposition (CEVD), outperforms NeurADP by up to 9.76% in terms of overall requests served, which is a significant improvement on a city-wide benchmark taxi dataset.

What problem does this paper attempt to address?

The paper addresses the matching problem in on-demand ride-pooling services, specifically the city-scale Ride-Pool Matching Problem (RMP). The goal is to assign user requests to vehicles so as to maximize an overall objective (such as the number of requests served or revenue), while satisfying quality constraints (for example, the delay in reaching the destination due to sharing does not exceed 10 minutes) and matching constraints (a request can be assigned to only one vehicle, and a vehicle can be assigned only one combination of requests).

### Core Problem of the Paper

Because of the high computational complexity, traditional ride-pooling methods consider only the immediate value of a match during online decision-making and ignore its impact on future value. In addition, although existing methods such as NeurADP improve scalability by decomposing the joint value function, at city scale they completely ignore the impact of other agents' (vehicles') actions on the future value of an individual agent. This omission degrades performance, especially when competition for demand among vehicles increases.

### Main Contributions

To solve these problems, the authors propose Conditional Expectation based Value Decomposition (CEVD). CEVD considers the impact of the current assignment on future value and also captures the impact of other agents' actions on individual value by using conditional probabilities, without increasing the complexity of training or decision-making. Experimental results show that CEVD outperforms NeurADP on a city-wide benchmark taxi dataset, increasing the number of requests served by up to 9.76%.
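The matching constraints described in the problem statement above can be illustrated with a small brute-force sketch. This is not the paper's solver: the function name `best_matching`, the toy vehicles and requests, and the myopic "requests served" objective are all assumptions chosen for illustration, and the quality constraints (such as the delay bound) are assumed to have been applied already when building each vehicle's list of feasible request combinations.

```python
from itertools import product

def best_matching(feasible, value):
    """Exhaustively pick one option per vehicle so that no request is served twice.

    feasible: dict vehicle -> list of request combinations (tuples); the empty
              tuple () means the vehicle takes no new request.
    value:    function (vehicle, combination) -> score of that match.
    """
    vehicles = list(feasible)
    best_score, best_assignment = float("-inf"), None
    # Each vehicle receives exactly one combination of requests.
    for choice in product(*(feasible[v] for v in vehicles)):
        assigned = [r for combo in choice for r in combo]
        # A request may be assigned to at most one vehicle.
        if len(assigned) != len(set(assigned)):
            continue
        score = sum(value(v, c) for v, c in zip(vehicles, choice))
        if score > best_score:
            best_score, best_assignment = score, dict(zip(vehicles, choice))
    return best_score, best_assignment

# Hypothetical toy instance: two vehicles, three requests.
feasible = {
    "v1": [(), ("r1",), ("r2",), ("r1", "r2")],
    "v2": [(), ("r2",), ("r3",), ("r2", "r3")],
}
# Myopic objective (an assumption for this sketch): number of requests served.
score, assignment = best_matching(feasible, lambda v, c: len(c))
```

Enumerating every joint choice grows exponentially with the number of vehicles, which is the scalability problem that motivates value decomposition. Replacing the myopic `value` function with a learned estimate of future value is where approaches such as NeurADP and CEVD come in.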
### Formula Representation

- **Conditional Expectation Formula**:

  \[ Q_{CG}^i(s, a) = f_i(a_i|s) + \sum_{j|(i, j)\in E,\ a_j\in A_j} P(a_j|a_i, s)\, f_j(a_j|s) \]

  where \(f_i(a_i|s)\) is the value of agent \(i\), \(P(a_j|a_i, s)\) is the conditional probability that agent \(j\) takes action \(a_j\) in state \(s\) given that agent \(i\) takes action \(a_i\), and \(f_j(a_j|s)\) is the value of agent \(j\).

- **Individual Value Update Formula**:

  \[ \hat{V}_i(s_i, f_t) = \frac{1}{1 + \lambda}\left[V(s_i, f_t) + \lambda\frac{1}{|C_k| - 1}\sum_{j\in C_k,\ j\neq i}\ \sum_{g\in F_t^j} P_j(g|s_i^t, f)\, V_j(s_j, g_t)\right] \]

  where \(\lambda\) is a learnable parameter that controls the importance of neighboring agents.

Through these changes, CEVD improves the performance of on-demand ride-pooling systems while maintaining scalability.
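As a rough numerical illustration of the individual value update formula, the sketch below blends an agent's own value with the average conditional expected value of the other agents in its cluster. The function name, the dictionary-based inputs and the numbers are hypothetical. In the paper \(\lambda\) is learned and the values come from a trained model, whereas here they are supplied by hand.

```python
def conditional_expected_value(own_value, neighbour_probs, neighbour_values, lam):
    """Blend an agent's own value with its cluster neighbours' expected values.

    own_value:        V(s_i, f_t) for the action being scored.
    neighbour_probs:  one dict per neighbour j, mapping action g -> P_j(g | s_i, f).
    neighbour_values: one dict per neighbour j, mapping action g -> V_j(s_j, g).
    lam:              weight lambda on the neighbours' contribution.
    """
    if not neighbour_probs:
        # Assumption for this sketch: with no neighbours, keep the own value.
        return own_value
    # Inner sum over g: each neighbour's conditional expected value.
    expectations = [
        sum(p * values[g] for g, p in probs.items())
        for probs, values in zip(neighbour_probs, neighbour_values)
    ]
    # The factor 1 / (|C_k| - 1) averages over the other agents in the cluster.
    mean_neighbour = sum(expectations) / len(expectations)
    return (own_value + lam * mean_neighbour) / (1.0 + lam)

# Hypothetical cluster: agent i plus two neighbours.
probs = [{"a": 0.5, "b": 0.5}, {"a": 1.0}]
values = [{"a": 2.0, "b": 6.0}, {"a": 8.0}]
blended = conditional_expected_value(4.0, probs, values, lam=1.0)
own_only = conditional_expected_value(4.0, probs, values, lam=0.0)
```

With `lam=0.0` the neighbours are ignored, which corresponds to scoring an action purely on the agent's own value. Larger `lam` shifts weight toward what the other vehicles in the cluster are expected to gain.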

Conditional Expectation based Value Decomposition for Scalable On-Demand Ride Pooling
