Abstract: Resource allocation in Narrowband Internet of Things (NB-IoT) networks is a complex challenge due to dynamic user demands, variable channel conditions, and distance considerations. Traditional approaches often struggle to adapt to these dynamic environments. This study applies reinforcement learning (RL), specifically the Soft Actor-Critic (SAC) algorithm, to NB-IoT resource allocation. We compare SAC's performance against two other RL algorithms, Deep Q-Network (DQN) and Proximal Policy Optimization (PPO).
The SAC agent is trained for adaptive resource allocation, targeting energy efficiency, throughput, latency, fairness, interference constraints, recovery time, and long-term performance stability, and it balances these objectives through a reward structure with penalty mechanisms. To demonstrate the scalability and effectiveness of SAC, we ran experiments on NB-IoT networks with varying deployment types and configurations: standard urban and suburban, high-density urban, industrial IoT, rural and low-density, and IoT service provider deployments. To assess generalization, we tested SAC across applications such as smart metering, smart cities, smart agriculture, and asset tracking and management. Our analysis shows that SAC outperforms DQN and PPO across multiple performance metrics, though by widely varying margins. SAC improves energy efficiency by 5.60% over PPO and 10.25% over DQN. Its latency reduction is marginal, approximately 0.0124% compared to PPO and 0.0126% compared to DQN. SAC raises throughput by 214.98% over PPO and 15.72% over DQN, and fairness (Jain's index) by 358.31% over PPO and 614.46% over DQN. Recovery time improves by 18.99% over PPO and 25.07% over DQN. Across both the deployment scenarios and the IoT applications, SAC consistently achieves high total rewards with minimal fluctuation. Energy efficiency remains constant at 7.2 bits per Joule, and latency is approximately 0.080 s. Throughput is robust across deployments, fairness remains high, which indicates equitable resource allocation, and recovery times are stable, which supports operational reliability. These results underscore SAC's efficiency and robustness in optimizing resource allocation in NB-IoT networks and make it a promising approach for such dynamic environments.
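The metrics quoted above can be made concrete with a minimal sketch. It assumes the standard definition of Jain's fairness index, energy efficiency as delivered bits divided by consumed Joules, and percentage gains computed relative to the baseline algorithm. The abstract does not state the exact formulas used, so the function names and the toy inputs below are hypothetical illustrations, not the authors' implementation or data.

```python
from typing import Sequence


def jain_index(allocations: Sequence[float]) -> float:
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), ranging from 1/n to 1."""
    n = len(allocations)
    total = sum(allocations)
    squares = sum(x * x for x in allocations)
    if n == 0 or squares == 0:
        raise ValueError("need at least one non-zero allocation")
    return total * total / (n * squares)


def energy_efficiency(bits_delivered: float, energy_joules: float) -> float:
    """Bits delivered per Joule consumed (assumed definition)."""
    if energy_joules <= 0:
        raise ValueError("energy must be positive")
    return bits_delivered / energy_joules


def relative_improvement(candidate: float, baseline: float) -> float:
    """Percentage change of candidate relative to baseline (assumed convention)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (candidate - baseline) / baseline


if __name__ == "__main__":
    # Toy values chosen for illustration only; they are not results from the study.
    print(jain_index([1.0, 1.0, 1.0, 1.0]))   # equal shares
    print(jain_index([4.0, 0.0, 0.0, 0.0]))   # one user takes everything
    print(energy_efficiency(72.0, 10.0))
    print(relative_improvement(1.5, 1.0))
```

For metrics where lower is better, such as latency and recovery time, an improvement corresponds to a negative value from `relative_improvement`, so the sign would be flipped before reporting.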

Next-gen resource optimization in NB-IoT networks: Harnessing soft actor-critic reinforcement learning
