Abstract: This paper studies the multi-agent resource allocation problem in vehicular networks using non-orthogonal multiple access (NOMA) and network slicing. Vehicles need to broadcast multiple packets with heterogeneous quality-of-service (QoS) requirements: safety-related packets (e.g., accident reports) require very low latency, whereas raw sensor data sharing (e.g., high-definition map sharing) requires high data rates. To meet these heterogeneous service requirements, we propose a network slicing architecture. We focus on a non-cellular network scenario in which vehicles broadcast over the direct device-to-device interface (i.e., sidelink communication). In such a vehicular network, resource allocation among vehicles is very difficult, mainly due to (i) the rapid variation of wireless channels among highly mobile vehicles and (ii) the lack of a central coordination point. Together, these factors preclude acquiring instantaneous channel state information to perform centralized resource allocation. The resource allocation problem considered is therefore very complex: it includes not only the usual spectrum and power allocation, but also coverage selection (which target vehicles to broadcast to) and packet selection (which network slice to use). These decisions must be made jointly, since the selected packets can be overlaid using NOMA and spectrum and power must therefore be carefully allocated to improve vehicle coverage. To do so, we first provide a mathematical programming formulation and a thorough NP-hardness analysis of the problem. Then, we model it as a multi-agent Markov decision process. Finally, to solve it efficiently, we use a deep reinforcement learning (DRL) approach and specifically propose a deep Q-learning (DQL) algorithm. The proposed DQL algorithm is practical because it can be implemented in an online and distributed manner. It is based on a cooperative learning strategy in which all agents perceive a common reward and thus learn cooperatively and distributively to improve the resource allocation solution through offline training. We show that our approach is robust and efficient under different variations of the network parameters and in comparison with centralized benchmarks.
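To illustrate why power must be allocated carefully when packets from two slices are overlaid with NOMA, the following minimal sketch computes the rates of two superposed packets at a single receiver. It uses the textbook two-signal superposition model with ideal successive interference cancellation (SIC), not the paper's own formulation. The function name `noma_rates` and all of its parameters are hypothetical and chosen only for this illustration.

```python
import math


def noma_rates(power_w, gain, noise_w, bandwidth_hz, alpha):
    """Rates (bit/s) of two superposed packets at one receiver.

    Assumed model, not taken from the paper: alpha is the fraction of
    transmit power given to packet 1, which is decoded first while
    packet 2 is treated as interference. Packet 2 is decoded after
    packet 1 has been cancelled (ideal SIC).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    rx = power_w * gain  # total received signal power
    # Packet 1 sees packet 2 as interference on top of the noise.
    sinr_first = alpha * rx / ((1.0 - alpha) * rx + noise_w)
    # Packet 2 is interference-free once packet 1 is removed.
    snr_second = (1.0 - alpha) * rx / noise_w
    rate_first = bandwidth_hz * math.log2(1.0 + sinr_first)
    rate_second = bandwidth_hz * math.log2(1.0 + snr_second)
    return rate_first, rate_second
```

Under these assumptions the sum of the two rates does not depend on `alpha`; the power split only shifts rate from one packet to the other, which is one way to see why packet selection and power allocation are coupled.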

Fast Spectrum Sharing in Vehicular Networks: A Meta Reinforcement Learning Approach

Meta Reinforcement Learning for Fast Spectrum Sharing in Vehicular Networks

Spectrum Sharing using Deep Reinforcement Learning in Vehicular Networks

A Hybrid Multi-Agent Reinforcement Learning Approach for Spectrum Sharing in Vehicular Networks

An approach to implement Reinforcement Learning for Heterogeneous Vehicular Networks

Multi-Agent Reinforcement Learning-Based Decentralized Spectrum Access in Vehicular Networks with Emergent Communication

Deep Learning based Wireless Resource Allocation with Application to Vehicular Networks

Meta-Reinforcement Learning Based Resource Allocation for Dynamic V2X Communications

Multi-Agent RL Enables Decentralized Spectrum Access in Vehicular Networks

Deep-Learning-Based Wireless Resource Allocation With Application to Vehicular Networks

Network slicing for vehicular communications: a multi-agent deep reinforcement learning approach

Meta Learning Based Adaptive Cooperative Perception in Nonstationary Vehicular Networks

A Deep Reinforcement Learning Scheme for Spectrum Sensing and Resource Allocation in ITS

Spectrum-Energy-Efficient Mode Selection and Resource Allocation for Heterogeneous V2X Networks: A Federated Multi-Agent Deep Reinforcement Learning Approach

Deep Reinforcement Learning-Based Spectrum Allocation Algorithm in Internet of Vehicles Discriminating Services

Learn to Allocate Resources in Vehicular Networks

Semantic-Aware Spectrum Sharing in Internet of Vehicles Based on Deep Reinforcement Learning

Transfer Learning in Multi-Agent Reinforcement Learning with Double Q-Networks for Distributed Resource Sharing in V2X Communication

Reinforcement Learning for Joint V2I Network Selection and Autonomous Driving Policies

Quick Learner Automated Vehicle Adapting its Roadmanship to Varying Traffic Cultures with Meta Reinforcement Learning

Platoon Leader Selection, User Association and Resource Allocation on a C-V2X based highway: A Reinforcement Learning Approach