Self-Attention DDPG for Multi-Beam Combining in mmWave MIMO Systems

Yingzhi Huang, Zhaoyang Zhang, Zhaohui Yang, Qianqian Yang
DOI: https://doi.org/10.1109/pimrc54779.2022.9977701
2022-01-01
Abstract: In this paper, we aim to design an efficient multi-beam combining scheme that requires only receive power measurements for a millimeter-wave (mmWave) multi-input multi-output (MIMO) communication system. A spectral efficiency maximization problem is formulated under both beam selection and power constraints. To solve this problem, a reinforcement learning (RL)-based multi-beam combining algorithm is proposed. In particular, a self-attention deep deterministic policy gradient (DDPG) scheme is used to adaptively learn the serving beam sets and the corresponding combining weights without any channel state information (CSI). Moreover, a transformer is integrated into the DDPG to precisely capture the signal directions and relevant strengths. Experimental results show the effectiveness of the proposed learning structure in terms of system achievable rate, convergence, and network robustness.
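To make the quantity being optimized concrete, the following is a minimal NumPy sketch of multi-beam combining for a single-stream receiver. It is not the authors' system model: the DFT codebook, the unit-norm normalization standing in for the power constraint, the single-user rate formula, and the names dft_codebook and combine_rate are all assumptions chosen for illustration. In the paper the beam set and weights are learned by the self-attention DDPG agent from receive power measurements; here they are simply given by hand.

```python
import numpy as np

def dft_codebook(n_antennas):
    """Columns are orthonormal DFT beams (assumed codebook, not from the paper)."""
    n = np.arange(n_antennas)
    return np.exp(-2j * np.pi * np.outer(n, n) / n_antennas) / np.sqrt(n_antennas)

def combine_rate(h, codebook, beam_idx, weights, snr):
    """Achievable rate in bit/s/Hz for a combiner built from selected beams.

    h        : receive channel vector (hypothetical; unknown to the agent in the paper)
    beam_idx : indices of the selected serving beams (beam selection constraint)
    weights  : complex combining weight per selected beam
    snr      : transmit power over noise power, linear scale
    """
    w = codebook[:, beam_idx] @ np.asarray(weights, dtype=complex)
    w = w / np.linalg.norm(w)            # unit norm, standing in for the power constraint
    gain = np.abs(np.vdot(w, h)) ** 2    # |w^H h|^2; np.vdot conjugates its first argument
    return np.log2(1.0 + snr * gain)

# Toy channel with two paths aligned to beams 2 and 5 of an 8-antenna codebook.
cb = dft_codebook(8)
h = cb[:, 2] + 0.5 * cb[:, 5]
single = combine_rate(h, cb, [2], [1.0], snr=1.0)           # strongest beam only
multi = combine_rate(h, cb, [2, 5], [1.0, 0.5], snr=1.0)    # both beams, matched weights
print(single, multi)
```

In this toy case, combining both beams with weights matched to the path strengths gives a higher rate than using the strongest beam alone, which is the gain a learned beam set and weight vector would aim to capture without access to h.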