Abstract: Multi-agent deep reinforcement learning (MADRL) is a promising approach to challenging problems in wireless environments that involve multiple decision-makers (or actors) with high-dimensional continuous action spaces. In this paper, we present a MADRL-based approach that jointly optimizes precoders to achieve the outer boundary, called the Pareto boundary, of the achievable rate region for a multiple-input single-output (MISO) interference channel (IFC). To address the two main challenges in the MISO IFC setup, namely multiple actors (or agents) with partial observability and a multi-dimensional continuous action space, we adopt a multi-agent deep deterministic policy gradient (MA-DDPG) framework in which decentralized actors with partial observability learn a multi-dimensional continuous policy in a centralized manner with the aid of a shared critic that has global information. We also address a phase ambiguity issue that arises from the conventional complex baseband representation of signals widely used in radio communications. To mitigate the impact of phase ambiguity on training performance, we propose a training method, called phase ambiguity elimination (PAE), that leads to faster learning and better performance of MA-DDPG in wireless communication systems. Simulation results show that MA-DDPG is capable of learning a near-optimal precoding strategy in a MISO IFC environment. To the best of our knowledge, this is the first work to demonstrate that the MA-DDPG framework can jointly optimize precoders to achieve the Pareto boundary of the achievable rate region in a multi-cell multi-user multi-antenna system.
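The phase ambiguity mentioned in the abstract can be illustrated with a minimal sketch. It assumes a two-user MISO IFC with single-stream linear precoding and unit noise power, and shows that rotating each precoder by an arbitrary common phase leaves the SINR unchanged, so many distinct action vectors map to the same reward. The abstract does not describe how PAE works; the `fix_phase` helper below is a hypothetical canonicalization (rotate so the first entry is real and non-negative) used only for illustration, and all channel and precoder values are made up.

```python
import cmath
import math


def inner(h, w):
    """Return the inner product h^H w for complex vectors given as lists."""
    return sum(hi.conjugate() * wi for hi, wi in zip(h, w))


def sinr(h_direct, h_cross, w_own, w_other, noise_power=1.0):
    """SINR at one receiver of a two-user MISO IFC (illustrative model)."""
    signal = abs(inner(h_direct, w_own)) ** 2
    interference = abs(inner(h_cross, w_other)) ** 2
    return signal / (interference + noise_power)


def rotate(w, theta):
    """Multiply every precoder entry by the common phase factor e^{j*theta}."""
    return [wi * cmath.exp(1j * theta) for wi in w]


def fix_phase(w):
    """Hypothetical canonical form (not the paper's PAE): rotate the
    precoder so that its first entry is real and non-negative."""
    return rotate(w, -cmath.phase(w[0]))


# Made-up channels to receiver 1 and made-up precoders for both transmitters.
h_direct = [1 + 0.5j, -0.3 + 0.8j]   # transmitter 1 -> receiver 1
h_cross = [0.2 - 0.1j, 0.4 + 0.3j]   # transmitter 2 -> receiver 1
w1 = [0.6 + 0.2j, -0.1 + 0.7j]
w2 = [0.3 - 0.5j, 0.5 + 0.4j]

base = sinr(h_direct, h_cross, w1, w2)
rotated = sinr(h_direct, h_cross, rotate(w1, 1.234), rotate(w2, -0.77))

# Two phase-rotated copies of the same precoder share one canonical form.
canonical_a = fix_phase(w1)
canonical_b = fix_phase(rotate(w1, 1.234))

print(math.isclose(base, rotated, rel_tol=1e-9))
```

Because the SINR, and hence the rate, depends only on the magnitudes of the inner products, a learning agent sees identical rewards for a whole family of precoders that differ by a phase factor. Mapping each family to a single representative is one plausible way to shrink the effective action space; whether PAE does exactly this is not stated in the abstract.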

Optimal User Scheduling in Multi Antenna System Using Multi Agent Reinforcement Learning

Joint User Scheduling and Antenna Selection in Distributed Massive MIMO Systems with Limited Backhaul Capacity

Efficient user selection algorithms for multiuser MIMO systems with zero-forcing dirty paper coding.

Joint Spatial-Frequency Domain Message Passing Algorithm for Radio Resource Scheduling

A Deep Reinforcement Learning-Based Resource Scheduler for Massive MIMO Networks

Multiuser Scheduling for Minimizing Age of Information in Uplink MIMO Systems

Joint QoS-Aware Scheduling and Precoding for Massive MIMO Systems via Deep Reinforcement Learning

Deep Reinforcement Learning for Multi-user Massive MIMO with Channel Aging

A Scheduling Scheme for Improving the Performance and Security of MU-MIMO Systems

Reinforcement Learning Based Antenna Selection in User-Centric Massive MIMO

Joint Deep Reinforcement Learning and Unfolding: Beam Selection and Precoding for mmWave Multiuser MIMO with Lens Arrays

Joint User Scheduling and Precoding for RIS-Aided MU-MISO Systems: A MADRL Approach

Joint Scheduling and ARQ for MU-MIMO Downlink in the Presence of Inter-Cell Interference

Multiuser MIMO Scheduling for Mobile Video Applications.

LoFi User Scheduling for Multiuser MIMO Wireless Systems

Deep learning based user scheduling for massive MIMO downlink system

Multi-agent deep reinforcement learning (MADRL) meets multi-user MIMO systems

Reinforcement Learning for Scheduling and MIMO Beam Selection using Caviar Simulations

Low Complexity Scheduling Technique for Multi-User MIMO Systems

Spectrum-efficient user grouping and resource allocation based on deep reinforcement learning for mmWave massive MIMO-NOMA systems

An MRL-Based Design Solution for RIS-Assisted MU-MIMO Wireless System under Time-Varying Channels