Delay optimal power control and relay selection for two-hop cooperative OFDM systems via distributive stochastic learning

Rui Wang, Vincent K. N. Lau, Huang Huang
DOI: https://doi.org/10.1109/ISIT.2010.5513424
2010-01-01
Abstract: In this paper, we propose a distributive delay-optimal power control and relay selection algorithm for two-hop cooperative OFDM systems. The complex interactions of the queues at the source node and the M relays (RSs) are modeled as an infinite-horizon average-reward Markov Decision Process (MDP), whose state space comprises the joint queue state information (QSI) of the queue at the source node and the queues at the M RSs, as well as the joint channel state information (CSI) of all source-relay (S-R) and relay-destination (R-D) links. As a first step to address the curse of dimensionality, we propose a reduced-state MDP formulation. From the associated Bellman equation, we show that the delay-optimal power control and link selection policy, which is a function of both the CSI and the QSI, has a multi-level water-filling structure. Furthermore, using stochastic learning, we derive a distributive online learning algorithm in which each node recursively estimates a per-node potential function based on real-time observations of the local CSI and local QSI only. We show that the combined distributive learning converges almost surely to a globally optimal solution for large arrivals. Finally, we show by simulation that the delay performance of the proposed scheme is significantly better than that of various baselines, such as conventional CSIT-only control and throughput-optimal control (in the stability sense).
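The abstract does not give the closed form of the multi-level water-filling policy, so the following is only a generic sketch of the idea: classical water-filling across subcarriers, with the water level chosen by the current queue length instead of being fixed. The function names, the lookup table `levels`, and the normalized gains are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def water_filling(gains: Sequence[float], water_level: float) -> List[float]:
    """Classical water-filling: p_n = max(0, water_level - 1/g_n).

    `gains` are normalized channel gains (assumed positive), one per subcarrier.
    """
    return [max(0.0, water_level - 1.0 / g) for g in gains]


def multi_level_water_filling(
    gains: Sequence[float], queue_len: int, levels: Sequence[float]
) -> List[float]:
    """Illustrative queue-aware variant (an assumption, not the paper's formula).

    The water level is looked up from the local queue length, so a longer
    queue raises the level and allocates more power. Queue lengths beyond
    the table are clamped to the last entry.
    """
    idx = min(queue_len, len(levels) - 1)
    return water_filling(gains, levels[idx])


if __name__ == "__main__":
    # Hypothetical table: water level for queue lengths 0, 1, 2, 3+.
    levels = [0.0, 0.5, 1.0, 2.0]
    gains = [2.0, 0.5, 4.0]
    for q in range(5):
        print(q, multi_level_water_filling(gains, q, levels))
```

In this sketch an empty queue maps to a water level of zero, so no power is spent, while a weak subcarrier (gain 0.5) stays switched off until the queue-dependent level exceeds its inverse gain. In the paper the levels would presumably come from the learned per-node potential function rather than from a fixed table.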