Abstract: In this paper, we study the design of a distributive delay-optimal cross-layer scheduling algorithm for two-hop relay communication systems over frequency selective fading channels. The complex interactions of the queues at the source node and the M relays (RSs) are modeled as an infinite horizon average reward Markov Decision Process (MDP), whose state space involves the joint queue state information (QSI) of the queue at the source node and the queues at the M RSs, as well as the joint channel state information (CSI) of all S-R links and R-D links. As a first step to address the curse of dimensionality, we propose a reduced state MDP formulation. From the associated Bellman equation, we show that the delay-optimal power control (and link selection) policy, which is a function of both the CSI and QSI, has a multi-level water-filling structure. Furthermore, using stochastic learning, we derive a distributive online learning algorithm in which each node recursively estimates a per-node potential function based on real-time observations of the local CSI and local QSI only. Based on the real-time local potential estimates and using approximate MDP, we propose an auction-based algorithm for link selection and show that the combined distributive learning converges almost surely to a global optimal solution for large arrivals. The proposed online learning algorithm differs from conventional online learning algorithms in two ways: (1) our online iterative solution updates both the value function (potential) and the Lagrange multipliers (LM) simultaneously; and (2) we establish the technical conditions for almost sure convergence even though the per-node potential update equation is no longer a contraction mapping, so the existing convergence results (based on contraction mapping) cannot be applied directly to our distributive stochastic learning algorithm.
Finally, we show by simulation that the delay performance of the proposed scheme is significantly better than various baselines such as the conventional CSIT-only control and the throughput-optimal control (in the stability sense).

Delay-Optimal Two-Hop Cooperative Relay Communications via Approximate MDP and Distributive Stochastic Learning

Delay-Aware Two-Hop Cooperative Relay Communications Via Approximate MDP and Stochastic Learning

Delay optimal power control and relay selection for two-hop cooperative OFDM systems via distributive stochastic learning

Queue-Aware Distributive Resource Control for Delay-Sensitive Two-Hop MIMO Cooperative Systems

Distributive Stochastic Learning for Delay-Optimal OFDMA Power and Subband Allocation

Stochastic Optimization for Joint Resource Allocation in OFDMA-Based Relay System

Delay-Aware Massive Random Access for Machine-Type Communications Via Hierarchical Stochastic Learning

Delay-Optimal User Scheduling and Inter-Cell Interference Management in Cellular Network via Distributive Stochastic Learning

Generalized two-hop relay for flexible delay control in MANETs

Partial Channel State Information Based Cooperative Relaying And Partner Selection

Stochastic Throughput Optimization for Two-Hop Systems with Finite Relay Buffers

Dynamic Partial Cooperative MIMO System for Delay-Sensitive Applications with Limited Backhaul Capacity

Distributive Subband Allocation, Power and Rate Control for Relay-Assisted OFDMA Cellular System with Imperfect System State Knowledge

Throughput-efficient Online Relay Selection for Dual-hop Cooperative Networks

Delay Optimal Scheduling for Cognitive Radios with Cooperative Beamforming: A Structured Matrix-Geometric Method

Balancing Performance and Cost for Two-Hop Cooperative Communications: Stackelberg Game and Distributed Multi-Agent Reinforcement Learning

Delay-Aware Online Service Scheduling in High-Speed Railway Communication Systems

Distributed Stochastic Cross-Layer Optimization for Multi-Hop Wireless Networks With Cooperative Communications

Novel Deep Reinforcement Learning‐based Delay‐constrained Buffer‐aided Relay Selection in Cognitive Cooperative Networks

Delay Optimal Scheduling for Cognitive Radio Networks with Cooperative Beamforming

Queuing Analyses and Statistically Bounded Delay Control for Two-Hop Green Wireless Relay Transmissions