Abstract: Energy harvesting (EH) is a promising technique for extending the lifetime of Internet of Things (IoT) networks. In practice, however, it is costly and impractical for every node to send its data to the base station (BS) in every time slot. To reduce energy cost, we therefore sample the data of only some nodes and reconstruct the state of all nodes through a belief state. In this paper, we consider a dynamic multiple-channel access problem in a single cell with multiple nodes, each equipped with an EH device and a rechargeable battery. In particular, we assume that the BS observes the power information of only the scheduled nodes, so the whole problem is modeled as a partially observable Markov decision process (POMDP). We first convert the observations of this subset of nodes into a belief state over all nodes, which is used to predict the scheduling policy for the next time slot. We then propose a deep reinforcement learning algorithm called double deep Q-network (Double DQN). Simulation results indicate that the proposed Double DQN outperforms other reinforcement learning (RL) algorithms.

Partially Observable Double DQN Based IoT Scheduling for Energy Harvesting