Meta-Critic Reinforcement Learning for Intelligent Omnidirectional Surface Assisted Multi-User Communications

Qinpei Luo, Zhu Han, Boya Di
DOI: https://doi.org/10.1109/twc.2024.3358372
IF: 10.4
2024-01-01
IEEE Transactions on Wireless Communications
Abstract: With 5G systems now highly developed, the need for next-generation networks, which demand extremely high data rates and low latency, is increasingly pressing. As an emerging technology capable of simultaneously reflecting and refracting incident signals on both sides, the intelligent omnidirectional surface (IOS) has recently been used to enhance the capacity of wireless networks. However, designing an IOS-enabled beamforming scheme that can respond quickly in a varying mobile environment is challenging due to its high complexity. In this paper, we aim to maximize the sum rate in an IOS-aided multi-user system under dynamically changing channel states and user mobility. A novel meta-critic reinforcement learning framework, named the meta-critic deep deterministic policy gradient algorithm, is proposed to design the IOS-enabled beamforming scheme. We propose a meta-critic network that can recognize environment changes and automatically perform self-renewal of the learning model. A stochastic explore-and-reload procedure is also tailored to mitigate the high-dimensional action space problem. Simulation results demonstrate that our proposed method outperforms other benchmarks, including the state-of-the-art reinforcement learning method, in both achievable sum rate and convergence speed.
telecommunications, engineering, electrical & electronic