AoI-Aware Resource Scheduling for Industrial IoT with Deep Reinforcement Learning

Hongzhi Li, Lin Tang, Shengwei Chen, Libin Zheng, Shaohong Zhong
DOI: https://doi.org/10.3390/electronics13061104
IF: 2.9
2024-03-18
Electronics
Abstract: Effective resource scheduling methods in certain scenarios of Industrial Internet of Things are pivotal. In time-sensitive scenarios, Age of Information is a critical indicator for measuring the freshness of data. This paper considers a densely deployed time-sensitive Industrial Internet of Things scenario. The industrial wireless device transmits data packets to the base station with limited channel resources under the constraints of Age of Information. It is assumed that each device has the capacity to store the packets it generates. The device will discard the data to alleviate the data queue backlog when the Age of Information of the data packet exceeds the threshold. We developed a new system utility equation to represent the scheduling problem and the problem is expressed as a trade-off between minimizing the average Age of Information and maximizing network throughput. Inspired by the success of reinforcement learning in decision-processing problems, we attempt to obtain an optimal scheduling strategy via deep reinforcement learning. In addition, a reward function is constructed to enable the agent to achieve improved convergence results. Compared with the baseline, our proposed algorithm can achieve better system utility and lower Age of Information violation rate.
engineering, electrical & electronic; computer science, information systems; physics, applied
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper addresses resource scheduling in the Industrial Internet of Things (IIoT), in particular how to manage limited channel resources in time-sensitive scenarios. Specifically, it asks how devices in a densely deployed IIoT environment can maximize network throughput while meeting data freshness requirements.

**Main issues include:**

1. **Age of Information (AoI) Management**:
   - In time-sensitive applications, data freshness is a critical metric. Devices must transmit data packets over limited channel resources, and if the AoI of a data packet exceeds a threshold, the device discards the packet to relieve the data queue backlog.
   - The goal is to lower the average AoI and the AoI violation rate so that the system stays stable and the data stays fresh.
2. **Resource Scheduling Optimization**:
   - When devices transmit over limited channel resources, a poor scheduling strategy can cause data queue backlogs and degrade service quality.
   - The paper uses Deep Reinforcement Learning (DRL) to search for a scheduling strategy that balances data freshness against network throughput.
3. **System Utility Maximization**:
   - The paper proposes a new system utility equation to represent the scheduling problem, framing it as a trade-off between minimizing average AoI and maximizing network throughput.
   - A purpose-built reward function helps the agent converge to better results, improving the overall system utility.

### Solution

The paper proposes a resource scheduling method based on Deep Reinforcement Learning (the SDDQN algorithm), built from the following parts:

1. **System Model**:
   - Establish a model of the IIoT communication system, including the data queue model and the AoI queue model for each device.
   - Devices share channel resources through TDMA, and the controller selects the device that transmits based on the current state.
2. **MDP Modeling**:
   - Model the resource scheduling problem as a Markov Decision Process (MDP), defining the environment state, action set, state transition probability, and reward function.
   - The environment state includes the data queue state, the age queue state, and the channel gain vector.
   - The action set consists of the decisions on whether each device should transmit data.
   - The reward function combines a data transmission incentive with an AoI penalty, a discard penalty, and a backlog penalty.
3. **SDDQN Algorithm**:
   - Improve the traditional DQN algorithm with a hybrid exploration strategy and a parameter update strategy.
   - Use a target network and an evaluation network to decouple action selection from action evaluation, which reduces overestimation of the action-value function.
   - Combine soft updates and hard updates to keep training stable and speed up convergence.

### Experimental Results

Simulation experiments evaluate the proposed algorithm under different AoI constraints. Compared with the baseline methods, the proposed algorithm achieves higher system utility and a lower AoI violation rate on the IIoT resource scheduling problem.
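
To make the system model concrete, the queue behaviour described above (TDMA sharing, per-packet age, discard once the age passes a threshold) can be sketched as a small slot-based simulation. This is a minimal sketch under stated assumptions, not the paper's model: the names `Device`, `oldest_first`, and `simulate` are hypothetical, time is assumed to advance in integer slots, each device is assumed to generate one packet per slot, exactly one device transmits per slot, and a simple oldest-first rule stands in for the learned SDDQN policy.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Device:
    """One device; the queue stores the generation slot of each waiting packet (FIFO)."""
    aoi_threshold: int                       # assumed AoI limit, measured in slots
    queue: deque = field(default_factory=deque)
    delivered: int = 0
    discarded: int = 0

    def generate(self, now: int) -> None:
        # Assumption: a new packet is stamped with the slot in which it was created.
        self.queue.append(now)

    def drop_stale(self, now: int) -> None:
        # Discard packets whose age (now - generation slot) exceeds the threshold.
        while self.queue and now - self.queue[0] > self.aoi_threshold:
            self.queue.popleft()
            self.discarded += 1

    def transmit(self) -> None:
        # Send the head-of-line packet, if any.
        if self.queue:
            self.queue.popleft()
            self.delivered += 1

    def head_age(self, now: int) -> int:
        # Age of the oldest waiting packet; 0 when the queue is empty.
        return now - self.queue[0] if self.queue else 0


def oldest_first(devices, now):
    """Stand-in policy: pick the device whose head packet is oldest (ties -> lowest index)."""
    return max(range(len(devices)), key=lambda i: devices[i].head_age(now))


def simulate(num_devices=3, slots=12, aoi_threshold=2):
    devices = [Device(aoi_threshold) for _ in range(num_devices)]
    for now in range(slots):
        for d in devices:
            d.generate(now)
            d.drop_stale(now)
        # TDMA assumption: a single device gets the channel in each slot.
        devices[oldest_first(devices, now)].transmit()
    return devices


if __name__ == "__main__":
    for i, d in enumerate(simulate()):
        print(f"device {i}: delivered={d.delivered} discarded={d.discarded} queued={len(d.queue)}")
```

With three devices competing for one transmission per slot, most packets age out before they can be sent, which is the backlog and discard pressure that the throughput versus AoI trade-off in the utility function is meant to capture. Replacing `oldest_first` with a learned policy is where a DRL agent would plug in.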