Non-Stationary Bandit Strategy for Rate Adaptation with Delayed Feedback

Yapeng Zhao, Hua Qian, Kai Kang, Yanliang Jin
DOI: https://doi.org/10.1109/access.2020.2988671
IF: 3.9
2020-01-01
IEEE Access
Abstract: Rate adaptation is an efficient mechanism for utilizing channel capacity by adjusting the modulation and coding scheme in a dynamic wireless environment. Channel feedback, such as acknowledgment/negative acknowledgment (ACK/NACK) messages, or channel measurements, such as the received signal strength indicator (RSSI), can be used for rate adaptation. Existing rate adaptation algorithms are mainly driven by heuristics and cannot achieve satisfactory transmission rates in a time-varying environment. In this paper, we focus on the rate adaptation problem in a time-division duplex (TDD) system. A multi-armed bandit (MAB) strategy is applied to learn changes in the channel condition from both RSSI and ACK/NACK signals. A discounted upper confidence bound based rate adaptation (DUCB-RA) algorithm is proposed. We prove mathematically that the performance of the proposed algorithm converges to the optimum. Simulation results demonstrate that the proposed algorithm can adapt to the time-varying channel and achieve higher transmission throughput than existing rate adaptation algorithms.
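To make the idea of a discounted upper confidence bound concrete, the following is a minimal sketch of a generic discounted UCB policy applied to rate selection. It is not the paper's DUCB-RA algorithm: it uses only ACK/NACK feedback, ignores RSSI and feedback delay, and every name and parameter here (`DiscountedUCB`, `simulate`, `gamma`, `xi`, the example rate set) is a hypothetical choice made for illustration. Each candidate rate is an arm, the reward is the normalized rate on an ACK and zero on a NACK, and older observations are down-weighted by `gamma` so the estimates can follow a changing channel.

```python
import math
import random


class DiscountedUCB:
    """Generic discounted UCB over a set of candidate rates (illustrative only)."""

    def __init__(self, rates, gamma=0.98, xi=0.6):
        self.rates = list(rates)      # candidate transmission rates (assumed values)
        self.gamma = gamma            # discount factor in (0, 1); assumed value
        self.xi = xi                  # exploration weight; assumed value
        self.max_rate = max(self.rates)
        self.counts = [0.0] * len(self.rates)  # discounted number of plays per arm
        self.sums = [0.0] * len(self.rates)    # discounted sum of rewards per arm

    def select(self):
        # Try every rate once before relying on the confidence bounds.
        for arm, count in enumerate(self.counts):
            if count == 0.0:
                return arm
        total = sum(self.counts)

        def index(arm):
            mean = self.sums[arm] / self.counts[arm]
            # Rewards lie in [0, 1], so the bonus is scaled for a unit range.
            bonus = 2.0 * math.sqrt(
                self.xi * math.log(max(total, 1.0)) / self.counts[arm]
            )
            return mean + bonus

        return max(range(len(self.rates)), key=index)

    def update(self, arm, ack):
        # Discount all past observations, then record the new one.
        for i in range(len(self.rates)):
            self.counts[i] *= self.gamma
            self.sums[i] *= self.gamma
        self.counts[arm] += 1.0
        if ack:
            self.sums[arm] += self.rates[arm] / self.max_rate


def simulate(success_prob, rates, steps, seed=0):
    """Run the policy against a channel model and return the average delivered rate.

    success_prob(t, arm) is a hypothetical channel model returning the ACK
    probability of rate index `arm` at step `t`.
    """
    rng = random.Random(seed)
    bandit = DiscountedUCB(rates)
    delivered = 0.0
    for t in range(steps):
        arm = bandit.select()
        ack = rng.random() < success_prob(t, arm)
        bandit.update(arm, ack)
        if ack:
            delivered += rates[arm]
    return delivered / steps
```

A time-varying channel can be emulated by making `success_prob` depend on `t`, for example by letting the highest rate succeed only in the first half of the run. A smaller `gamma` forgets old feedback faster, at the cost of noisier estimates in a stable channel.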