Regret of Age-of-Information Bandits in Nonstationary Wireless Networks

Zaofa Song, Tao Yang, Xiaofeng Wu, Hui Feng, Bo Hu
DOI: https://doi.org/10.1109/LWC.2022.3205316
IF: 6.3
2022-01-01
IEEE Wireless Communications Letters
Abstract: We consider a wireless network in which a source periodically generates time-sensitive information and transmits it to a destination via one of N non-stationary orthogonal wireless channels. The goal of the scheduling policy is to keep the information at the destination fresh, which is captured by the Age of Information (AoI) metric. Since an accurate analytical characterization of AoI performance in non-stationary wireless channels is usually intractable, we resort to multi-armed bandits (MAB) to solve this problem, where the non-stationary channels and the AoI are taken as arms and rewards, respectively. We consider three special non-stationary channel settings and, for each, derive a lower bound on the AoI regret achievable by any policy. In addition, upper bounds on the AoI regret of the Exp3.S, Active Arm Elimination (AAE) and Cumulative Sum Upper Confidence Bound (CUSUM-UCB) policies are presented for the corresponding three settings. Furthermore, variants of AAE and CUSUM-UCB are proposed and shown via simulations to be more effective than the original policies.
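The bandit framing in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' method or code. It assumes a common AoI convention that the abstract does not spell out: the age resets to 1 after a successful transmission and grows by 1 otherwise. It also assumes that AoI regret is the cumulative AoI of a policy minus that of an oracle that always picks the currently best channel. The function names, the piecewise-stationary channel model, and all numeric values are hypothetical.

```python
import random

def simulate_aoi(policy, success_probs, horizon, seed=0):
    """Cumulative AoI over `horizon` slots (assumed AoI convention).

    policy(t)        -> index of the channel (arm) used in slot t
    success_probs(t) -> list of per-channel success probabilities in slot t
    """
    rng = random.Random(seed)
    age = 1
    total = 0
    for t in range(horizon):
        arm = policy(t)
        # One uniform draw per slot; success if it falls below the channel's probability.
        if rng.random() < success_probs(t)[arm]:
            age = 1          # fresh update delivered
        else:
            age += 1         # destination information gets one slot older
        total += age
    return total

def piecewise_channels(t, change_point=500):
    """Hypothetical two-channel model whose statistics swap at `change_point`."""
    return [0.9, 0.2] if t < change_point else [0.2, 0.9]

def oracle(t):
    """Oracle that knows the current statistics and picks the best channel."""
    probs = piecewise_channels(t)
    return probs.index(max(probs))

def fixed_arm(t):
    """Naive policy that never leaves channel 0."""
    return 0

def aoi_regret(policy, horizon, seed=0):
    """Cumulative AoI of `policy` minus that of the oracle, using the same seed."""
    return (simulate_aoi(policy, piecewise_channels, horizon, seed)
            - simulate_aoi(oracle, piecewise_channels, horizon, seed))
```

Calling `aoi_regret(fixed_arm, 1000)` compares a policy that ignores the change point against the oracle. Because both runs share the same seed, they see the same per-slot random draws, so the difference reflects only the channel choices.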