Abstract: Nowadays, most taxi drivers have become users of the relocation recommendation service offered by online ride-hailing platforms (e.g., Uber and Didi Chuxing), which can often lead drivers to places with profitable orders. At the same time, electric taxis (e-taxis) are increasingly adopted and are gradually replacing gasoline taxis in today's public transportation systems due to their environmentally friendly nature. Though effective for traditional gasoline taxis, existing relocation recommendation schemes are rather suboptimal for e-taxi drivers' user experience. On the one hand, the existing schemes take no account of taxis' refueling decisions, as the refueling durations of gasoline taxis are usually short enough to be ignored. However, the time e-taxis spend at charging stations can be as long as hours. An e-taxi's battery could easily be depleted by the continuous relocations suggested by existing schemes, and it would then have to be charged for a long time afterwards, making the e-taxi driver miss numerous order-serving opportunities. On the other hand, charging posts are typically sparsely and unevenly distributed across a city. With no consideration of charging opportunities, existing schemes could send an e-taxi to an area with no charging post around, even though its battery is running low. To optimize e-taxi drivers' user experience, in this paper, we design a joint charging and relocation recommendation system for e-taxi drivers (CARE). We take the perspective of e-taxi drivers and formulate their decision making as a multi-agent reinforcement learning problem where each e-taxi driver aims to maximize his own cumulative rewards. More specifically, we propose a novel multi-agent mean field hierarchical reinforcement learning (MFHRL) framework. The hierarchical architecture of MFHRL helps the proposed CARE provide far-sighted charging and relocation recommendations for e-taxi drivers. Besides, we integrate each hierarchical level of MFHRL separately with the mean field approximation to incorporate e-taxis' mutual influences in decision making. We set up a simulator with one of the largest real-world e-taxi datasets in Shenzhen, China, which contains the GPS trajectory data and transaction data of 3848 e-taxis from June 1st to June 30th, 2017, coupled with 165 charging stations including 317 fast charging posts and 1421 slow charging posts. We adopt this simulator to generate 6 dynamic urban environments, which reflect the different real-world scenarios faced by e-taxi drivers. In all of these environments, we conduct extensive experiments to validate that the proposed MFHRL framework greatly outperforms all baselines by significantly increasing the rewards obtained by e-taxi drivers. Besides, we also show that the charging policy learned by MFHRL can effectively reduce the range anxiety of e-taxi drivers, which significantly boosts e-taxi drivers' quality of experience.
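To make the mean field idea concrete: a common form of the approximation in multi-agent reinforcement learning summarizes the actions of an agent's neighbours by their average, so each agent evaluates its own action against one "mean action" instead of every neighbour individually. The sketch below illustrates that general technique only. The three-action set, the function names, and the toy crowding-penalty value function are assumptions made for illustration, not details taken from CARE or MFHRL.

```python
from typing import Callable, List, Sequence

# Hypothetical action set for illustration: 0 = stay, 1 = relocate, 2 = charge.
NUM_ACTIONS = 3


def mean_action(neighbor_actions: Sequence[int], num_actions: int) -> List[float]:
    """Average the one-hot encodings of the neighbours' discrete actions."""
    if not neighbor_actions:
        return [0.0] * num_actions
    counts = [0.0] * num_actions
    for action in neighbor_actions:
        counts[action] += 1.0
    return [c / len(neighbor_actions) for c in counts]


def greedy_action(
    q_fn: Callable[[int, List[float]], float],
    num_actions: int,
    mean_act: List[float],
) -> int:
    """Pick the action with the highest value given the neighbours' mean action."""
    return max(range(num_actions), key=lambda a: q_fn(a, mean_act))


def toy_q(action: int, mean_act: List[float]) -> float:
    """Toy stand-in for a learned Q(a, mean_a): a base value minus a crowding penalty."""
    base_value = [0.5, 1.0, 0.8]
    return base_value[action] - 2.0 * mean_act[action]


neighbours = [1, 1, 2, 1]  # three neighbours relocate, one charges
m = mean_action(neighbours, NUM_ACTIONS)
choice = greedy_action(toy_q, NUM_ACTIONS, m)
```

In this toy setting the agent avoids the action most of its neighbours chose because the assumed penalty grows with crowding. A learned value function would replace `toy_q`, and it would also depend on the agent's state.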

What problem does this paper attempt to address?

### Problems the paper attempts to solve

This paper addresses the charging and repositioning decisions of electric taxis (e-taxis). Existing repositioning recommendation schemes are effective for traditional gasoline taxis, but they serve electric taxi drivers poorly. The main problems are:

1. **Long charging time**: Existing repositioning recommendation schemes do not consider the refueling decisions of taxis, because the refueling time of gasoline taxis is usually short enough to be ignored. However, electric taxis may spend several hours at charging stations. If electric taxis continuously follow existing repositioning suggestions, their batteries may be quickly depleted and then need a long charge, so drivers miss many opportunities to serve orders.
2. **Sparse and uneven distribution of charging stations**: Charging stations are usually sparsely and unevenly distributed across a city. If charging opportunities are not considered, existing repositioning recommendation schemes may lead electric taxis to areas with no charging station nearby, even when their battery is already very low.

To optimize the user experience of electric taxi drivers, this paper designs a joint charging and repositioning recommendation system (CARE). The system takes the perspective of electric taxi drivers and models their decision making as a multi-agent reinforcement learning problem in which each driver aims to maximize their own cumulative rewards. Specifically, the paper proposes a new multi-agent mean-field hierarchical reinforcement learning (MFHRL) framework. Through this framework, CARE can provide far-sighted charging and repositioning suggestions for electric taxi drivers.

### Main contributions

1. **First joint charging and repositioning recommendation system**: The paper designs a joint charging and repositioning recommendation system (CARE) to coordinate the charging and repositioning decisions of thousands of electric taxis at urban scale. It models the problem as a multi-agent reinforcement learning problem from the perspective of electric taxi drivers, where each driver aims to maximize their long-term rewards.
2. **Multi-agent mean-field hierarchical reinforcement learning framework**: To solve this problem, the paper proposes a new multi-agent mean-field hierarchical reinforcement learning (MFHRL) framework. The hierarchical architecture of MFHRL sets goals for agents, enabling them to learn far-sighted charging and repositioning decisions effectively. In addition, the paper integrates the mean-field approximation separately into each of the two hierarchical levels to account for the mutual influence between agents. According to the authors, MFHRL is the first multi-agent reinforcement learning framework that combines hierarchical reinforcement learning with the mean-field approximation.
3. **Experimental verification on a large-scale real-world dataset**: The study is based on a large-scale real-world dataset containing 3,848 electric taxis, about 168,000 orders per day, and 164 charging stations (including 317 fast charging posts and 1,421 slow charging posts). Based on this dataset, the paper builds an electric taxi simulator and conducts extensive experiments. The results show that the proposed MFHRL framework significantly outperforms all baseline methods.

### Data-driven analysis

By analyzing the GPS trajectories, transaction data, and charging station data of electric taxis in Shenzhen from June 1, 2017 to June 30, 2017, the paper reveals the charging problems faced by electric taxi drivers:

1. **Long charging time**: 20% of electric taxi drivers spend more than 158 minutes charging every day, which is nearly 20 times longer than the daily refueling time of gasoline taxis and causes them to miss many opportunities to serve orders.
2. **Sparse and uneven distribution of charging stations**: 20% of electric taxi drivers need to drive more than 2.5 kilometers to reach the nearest charging station after completing an order, and in the suburbs this distance may exceed 6 kilometers.
3. **Serious congestion at charging stations**: Although there are 164 charging stations in Shenzhen, 80% of charging activity is concentrated in only 37 of them. These stations do not have enough charging posts, so many electric taxi drivers have to queue for charging and face serious congestion.

In summary, the paper designs a recommendation system that integrates charging decisions with repositioning recommendations, aiming to optimize the charging and repositioning decisions of electric taxi drivers and thereby improve their user experience.
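A concentration figure like the one in the analysis (80% of charging activity at 37 of 164 stations) can be derived from per-station charging event counts. The following is a minimal Python sketch of that calculation, assuming the counts are already aggregated into a list. The function name and the sample counts are hypothetical and do not come from the Shenzhen dataset.

```python
from typing import Sequence


def stations_covering_share(events_per_station: Sequence[int], share: float = 0.8) -> int:
    """Return the smallest number of busiest stations that together
    account for at least `share` of all charging events."""
    total = sum(events_per_station)
    if total == 0:
        return 0
    covered = 0
    # Walk through the stations from busiest to least busy.
    for rank, count in enumerate(sorted(events_per_station, reverse=True), start=1):
        covered += count
        if covered >= share * total:
            return rank
    return len(events_per_station)


# Hypothetical counts for four stations (not real data).
sample_counts = [10, 60, 5, 25]
busiest_needed = stations_covering_share(sample_counts, share=0.8)
```

With the sample counts above, the two busiest stations (60 and 25 events) already cover 85 of 100 events, so the function returns 2. Applied to real per-station counts, a small result relative to the number of stations indicates the kind of congestion described in the analysis.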

Joint Charging and Relocation Recommendation for E-Taxi Drivers via Multi-Agent Mean Field Hierarchical Reinforcement Learning

A Simulation–optimization Framework for a Dynamic Electric Ride-Hailing Sharing Problem with a Novel Charging Strategy

Spatial-temporal Pricing for Ride-Sourcing Platform with Reinforcement Learning

Optimizing Autonomous Electric Taxi Operations with Integrated Mobile Charging Services: an Approximate Dynamic Programming Approach

Multi-service Provision for Electric Vehicles in Power-Transportation Networks Towards a Low-Carbon Transition: A Hierarchical and Hybrid Multi-Agent Reinforcement Learning Approach

Optimizing Routing and Scheduling of Shared Autonomous Electric Taxis Considering Capacity Constrained Parking Facilities

Joint Scheduling of Charging and Service Operation of Electric Taxi Based on Reinforcement Learning

META: A City-Wide Taxi Repositioning Framework Based on Multi-Agent Reinforcement Learning

Dynamic Balancing-Charging Management for Shared Autonomous Electric Vehicle Systems: A Two-Stage Learning-Based Approach

A Clustering-Based Multi-Agent Reinforcement Learning Framework for Finer-Grained Taxi Dispatching

Data-Driven Fairness-Aware Vehicle Displacement for Large-Scale Electric Taxi Fleets

ForETaxi: Data-Driven Fleet-Oriented Charging Resource Allocation in Large-Scale Electric Taxi Networks

Intelligent Electric Vehicle Charging Recommendation Based on Multi-Agent Reinforcement Learning

Optimising Stochastic Routing for Taxi Fleets with Model Enhanced Reinforcement Learning

RLCharge: Imitative Multi-Agent Spatiotemporal Reinforcement Learning for Electric Vehicle Charging Station Recommendation

Online Operations of Automated Electric Taxi Fleets: an Advisor-Student Reinforcement Learning Framework

Optimize Taxi Driving Strategies Based On Reinforcement Learning

Real-Time Charging Station Recommendation System for Electric-Vehicle Taxis

Electric Taxi Charging Load Prediction Based on Trajectory Data and Reinforcement Learning—A Case Study of Shenzhen Municipality

Optimal Passenger-Seeking Policies on E-hailing Platforms Using Markov Decision Process and Imitation Learning