Abstract: The Intelligent Transportation System (ITS) environment is known to be dynamic and distributed, where participants (vehicle users, operators, etc.) have multiple, changing and possibly conflicting objectives. Although Reinforcement Learning (RL) algorithms are commonly applied to optimize ITS applications such as resource management and offloading, most RL algorithms focus on single objectives. In many situations, converting a multi-objective problem into a single-objective one is impossible, intractable or insufficient, making such RL algorithms inapplicable. We propose a multi-objective, multi-agent reinforcement learning (MARL) algorithm with high learning efficiency and low computational requirements, which automatically triggers adaptive few-shot learning in a dynamic, distributed and noisy environment with sparse and delayed reward. We test our algorithm in an ITS environment with edge cloud computing. Empirical results show that the algorithm adapts quickly to new environments and outperforms the state-of-the-art benchmark on all individual and system metrics. Our algorithm also addresses various practical concerns with its modularized and asynchronous online training method. In addition to the cloud simulation, we test our algorithm on a single-board computer and show that it can perform inference in 6 milliseconds.

What problem does this paper attempt to address?

This paper addresses multi-objective optimization in intelligent transportation systems (ITS), especially in distributed, non-stationary and adversarial environments. Specifically, it targets the following aspects:

1. **Complexity of multi-objective problems**: In intelligent transportation systems, participants (such as vehicle users and operators) usually have multiple, changing and potentially conflicting goals. Most existing reinforcement learning (RL) algorithms handle only single-objective problems, and reducing a multi-objective problem to a single-objective one is often infeasible, intractable or insufficient.
2. **Adaptability in dynamic environments**: The ITS environment is dynamic, distributed and noisy, and the reward signal is sparse and delayed. Existing RL algorithms perform poorly in such environments, especially when the combination of goals and preferences changes frequently.
3. **Computational efficiency and resource utilization**: Existing multi-objective RL algorithms often carry high computational costs, which makes efficient online retraining difficult in practice. This motivates an efficient multi-objective, multi-agent reinforcement learning (MARL) algorithm with low computational requirements.

### Main contributions of the paper

1. **A multi-objective MARL algorithm for distributed, non-stationary environments**: The paper proposes, for the first time, a multi-objective MARL algorithm suited to such environments. The algorithm can optimize under frequently changing combinations of goals and preferences.
2. **Efficient online retraining**: An initial model is trained offline and then deployed to each independent agent (representing a vehicle user). The agents update their offloading strategies through online few-shot learning, without prior knowledge of the reward shape, which reduces the retraining cost.
3. **Modular and asynchronous training**: The algorithm can be modularized and trained asynchronously, improving flexibility and scalability. Experiments show that it outperforms existing benchmark algorithms on all individual and system metrics. It also improves the underlying resource efficiency in heterogeneous environments, so that other algorithms benefit from the improved offloading rate and fairness.
4. **Real-time inference performance**: Tests on a single-board computer show that the algorithm can complete inference within 6 milliseconds, meeting real-time requirements.
5. **Published code and data**: To promote research and application, the authors provide publicly accessible code and data.

### Summary

This paper tackles the challenges of multi-objective optimization in intelligent transportation systems, especially in dynamic, distributed and non-stationary environments. By proposing an efficient multi-objective, multi-agent reinforcement learning algorithm, it improves the performance of resource allocation and offloading decisions and offers a feasible solution for practical applications.
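The claim that collapsing several objectives into one can be insufficient has a simple generic illustration, independent of the paper's method. The Python sketch below assumes three hypothetical policies scored on two hypothetical objectives (higher is better); all names and numbers are invented for illustration. It shows that a fixed weighted sum, one common way of forming a single objective, never selects the balanced policy even though that policy is Pareto-optimal.

```python
from typing import Dict, List, Tuple

# Hypothetical objective vectors (objective_a, objective_b); higher is better.
# These values are illustrative assumptions, not results from the paper.
CANDIDATES: Dict[str, Tuple[float, float]] = {
    "favor_a": (1.0, 0.0),
    "favor_b": (0.0, 1.0),
    "balanced": (0.45, 0.45),
}


def dominates(u: Tuple[float, float], v: Tuple[float, float]) -> bool:
    """True if u is at least as good as v everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(u, v)) and any(x > y for x, y in zip(u, v))


def pareto_optimal(cands: Dict[str, Tuple[float, float]]) -> List[str]:
    """Names of candidates that no other candidate dominates."""
    return sorted(
        name
        for name, v in cands.items()
        if not any(dominates(u, v) for u in cands.values())
    )


def weighted_sum_winners(
    cands: Dict[str, Tuple[float, float]], steps: int = 101
) -> List[str]:
    """Names that maximize w*a + (1-w)*b for at least one weight w in [0, 1]."""
    winners = set()
    for i in range(steps):
        w = i / (steps - 1)
        best = max(cands, key=lambda n: w * cands[n][0] + (1 - w) * cands[n][1])
        winners.add(best)
    return sorted(winners)


if __name__ == "__main__":
    print("Pareto-optimal:", pareto_optimal(CANDIDATES))
    print("Reachable by weighted sum:", weighted_sum_winners(CANDIDATES))
```

The balanced policy scores 0.45 under every weight, while one of the two extreme policies always scores at least 0.5, so no choice of weights recovers it. This is only one reason a single-objective reduction may fall short; the paper's own argument may rest on other considerations, such as preferences that change over time.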

Multi-Objective Optimization Using Adaptive Distributed Reinforcement Learning

Target-Value-Competition-Based Multi-Agent Deep Reinforcement Learning Algorithm for Distributed Nonconvex Economic Dispatch

Multiagent Reinforcement Learning for Strictly Constrained Tasks Based on Reward Recorder

Observer-Based Multiagent Deep Reinforcement Learning: A Fully Distributed Training Scheme

Collaborative multi-agents in dynamic industrial internet of things using deep reinforcement learning

Adaptive Individual Q-Learning-A Multiagent Reinforcement Learning Method for Coordination Optimization

Multi-Agent Reinforcement Learning-Based Decision Making for Twin-Vehicles Cooperative Driving in Stochastic Dynamic Highway Environments

Distributed Adaptive Reinforcement Learning: A Method for Optimal Routing

Scalable Model-based Policy Optimization for Decentralized Networked Systems

Autonomous Intersection Management with Heterogeneous Vehicles: A Multi-Agent Reinforcement Learning Approach

D-HAL: Distributed Hierarchical Adversarial Learning for Multi-Agent Interaction in Autonomous Intersection Management

Multi-Agent Deep Reinforcement Learning for Large-scale Traffic Signal Control

Hybrid Information-driven Multi-agent Reinforcement Learning

Multi-Agent Reinforcement Learning for Traffic Flow Management of Autonomous Vehicles

Real-Time Multi-Vehicle Scheduling in Tasks With Dependency Relationships Using Multi-Agent Reinforcement Learning

Multi-Agent Reinforcement Learning for Autonomous Driving: A Survey

Multi-agent deep reinforcement learning with centralized training and decentralized execution for transportation infrastructure management

Multi-Agent Reinforcement Learning for Long-Term Network Resource Allocation through Auction: a V2X Application

Multi-agent reinforcement learning for intelligent resource allocation in IIoT networks

Deep Multi-agent Reinforcement Learning for Highway On-Ramp Merging in Mixed Traffic