Abstract: Energy-Harvesting (EH) Cognitive Internet of Things (EH-CIoT) networks are susceptible to jamming attacks because of the broadcast nature of wireless communication, and such attacks seriously reduce throughput. This paper therefore investigates an anti-jamming resource-allocation method that aims to maximize the Long-Term Throughput (LTT) of the EH-CIoT network. Specifically, the resource-allocation problem is modeled as a Markov Decision Process (MDP) without prior knowledge. On this basis, a two-dimensional reward function comprising throughput and energy rewards is designed. On the one hand, the Agent Base Station (ABS) directly evaluates the effectiveness of its actions through the throughput reward in order to maximize the LTT. On the other hand, considering the EH characteristics and limited battery capacity, the energy reward guides the ABS to allocate channels so that Secondary Users (SUs) with insufficient power can harvest more energy for transmission, which indirectly improves the LTT. For the case where the activity states of the Primary Users (PUs), the channel information, and the jammer's strategies are not available in advance, this paper proposes a Linearly Weighted Deep Deterministic Policy Gradient (LWDDPG) algorithm to maximize the LTT. LWDDPG extends DDPG to accommodate the two-dimensional reward function. It enables the ABS to allocate transmission channels, continuous power, and work modes to the SUs, so that the SUs not only transmit on unjammed channels but also harvest more RF energy to replenish their batteries. Finally, simulation results demonstrate the validity and superiority of the proposed method compared with traditional methods under multiple jamming attacks.
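
The abstract does not state how the throughput and energy rewards are combined. As a minimal sketch of what linear weighting of a two-dimensional reward could look like, the Python snippet below collapses the two components into one scalar that a DDPG-style critic could be trained on. The function name `scalarize_reward`, the weight `w_throughput`, and its default value are illustrative assumptions and are not taken from the paper.

```python
def scalarize_reward(throughput_reward: float,
                     energy_reward: float,
                     w_throughput: float = 0.7) -> float:
    """Linearly weight a (throughput, energy) reward pair into one scalar.

    Assumption: the two weights are non-negative and sum to 1, so the
    energy weight is 1 - w_throughput. The paper may use a different
    weighting scheme.
    """
    if not 0.0 <= w_throughput <= 1.0:
        raise ValueError("w_throughput must lie in [0, 1]")
    w_energy = 1.0 - w_throughput
    return w_throughput * throughput_reward + w_energy * energy_reward


if __name__ == "__main__":
    # Hypothetical per-step rewards observed by the ABS.
    print(scalarize_reward(throughput_reward=2.0, energy_reward=1.0,
                           w_throughput=0.5))
```

Under this assumed form, a larger `w_throughput` favors immediate transmission on unjammed channels, while a smaller value favors energy harvesting by SUs with low battery levels.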
