Abstract: The cognitive industrial Internet of Things (CIIoT) can improve transmission performance by utilizing the spectrum licensed to a primary user (PU), provided that the normal communication of the PU is not disturbed. However, traditional spectrum access schemes for the CIIoT adapt poorly to varying communication environments. In this article, Q-learning-based dynamic spectrum access is proposed for the CIIoT to intelligently utilize the spectrum resources in three access scenarios: orthogonal multiple access (OMA), underlay spectrum access, and nonorthogonal multiple access (NOMA). In the OMA scheme, the CIIoT learns to access the idle channels to avoid disturbing the PUs, but its communication continuity cannot be guaranteed when most of the channels are occupied by the PUs. In the underlay scheme, the CIIoT learns to utilize the busy channels to ensure communication continuity by limiting its transmit power within the tolerance of the PU. However, the interference to the PU cannot be eliminated, which decreases the PU's throughput. In the NOMA scheme, by contrast, the CIIoT can utilize the busy channels while canceling the interference to the PU with successive interference cancellation, which guarantees the transmission performance of both the CIIoT and the PU. A Q-learning-based spectrum access algorithm is proposed to improve the transmission performance of the CIIoT in the three schemes. The simulation results show the advantages of the Q-learning-based NOMA scheme in terms of guaranteeing the throughput of the CIIoT nodes and decreasing the interference to the PUs.
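
To make the OMA case concrete, the following is a minimal sketch of tabular Q-learning for idle-channel selection. It is not the algorithm from the article, whose state, action, and reward definitions are not given in the abstract. Everything here is an illustrative assumption: each channel is busy independently with a fixed probability (`busy_prob`), the state is the channel accessed in the previous slot, the reward is +1 for an idle channel and -1 for a collision with a PU, and the names `train_q_learning` and `greedy_channel` are hypothetical.

```python
import random


def train_q_learning(busy_prob, steps=20000, alpha=0.05, gamma=0.5,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning for channel selection (illustrative assumptions only).

    busy_prob[c] is the assumed probability that a PU occupies channel c
    in any slot. State = channel used in the previous slot; action = channel
    to access in the current slot.
    """
    rng = random.Random(seed)
    n = len(busy_prob)
    q = [[0.0] * n for _ in range(n)]  # q[state][action]
    state = 0
    for _ in range(steps):
        # Epsilon-greedy exploration over the available channels.
        if rng.random() < epsilon:
            action = rng.randrange(n)
        else:
            action = max(range(n), key=lambda a: q[state][a])
        # Assumed reward: +1 if the chosen channel is idle, -1 if a PU is on it.
        channel_busy = rng.random() < busy_prob[action]
        reward = -1.0 if channel_busy else 1.0
        next_state = action
        # Standard Q-learning update toward the bootstrapped TD target.
        td_target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (td_target - q[state][action])
        state = next_state
    return q


def greedy_channel(q, state):
    """Return the channel with the highest learned Q-value in this state."""
    return max(range(len(q[state])), key=lambda a: q[state][a])


# Hypothetical occupancy profile: channel 2 is idle most of the time.
BUSY_PROB = [0.9, 0.7, 0.1, 0.8]
Q_TABLE = train_q_learning(BUSY_PROB)
BEST = greedy_channel(Q_TABLE, state=2)
```

Under these assumptions the learned policy should settle on the channel that is most often idle. An underlay or NOMA variant would need a different action space (for example, transmit power levels) and a reward that accounts for interference to the PU, which this sketch does not model.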

Dynamic Spectrum Access for D2D-Enabled Internet-of-Things: A Deep Reinforcement Learning Approach

Deep Reinforcement Learning-Based Dynamic Spectrum Access for D2D Communication Underlay Cellular Networks

A deep reinforcement learning-based D2D spectrum allocation underlaying a cellular network

Deep Reinforcement Learning Based Massive Access Management for Ultra-Reliable Low-Latency Communications

Hybrid Centralized-Distributed Resource Allocation Based on Deep Reinforcement Learning for Cooperative D2D Communications

Dynamic Spectrum Access for Ambient Backscatter Communication-assisted D2D Systems with Quantum Reinforcement Learning

Dynamic multiple access based on deep reinforcement learning for Internet of Things

RDRL: A Recurrent Deep Reinforcement Learning Scheme for Dynamic Spectrum Access in Reconfigurable Wireless Networks

Cooperative Multi-Agent Reinforcement Learning Based Distributed Dynamic Spectrum Access in Cognitive Radio Networks

Dynamic Resource Allocation for Device‐to‐Device Communication Underlaying Cellular Networks

Machine Learning-Based Resource Optimization for D2D Communication Underlaying Networks

Reinforcement-Learning-Based Dynamic Spectrum Access for Software-Defined Cognitive Industrial Internet of Things

Deep Reinforcement Learning for Joint Channel Selection and Power Control in D2D Networks

Dynamic Spectrum Sharing Based on Deep Reinforcement Learning in Mobile Communication Systems

Distributive Dynamic Spectrum Access through Deep Reinforcement Learning: A Reservoir Computing Based Approach

Intelligent Access to Unlicensed Spectrum: A Mean Field Based Deep Reinforcement Learning Approach

Double Deep Q-Network Based Distributed Resource Matching Algorithm for D2D Communication

Traffic Priority-Aware Multi-User Distributed Dynamic Spectrum Access: A Multi-Agent Deep RL Approach

GRLinQ: An Intelligent Spectrum Sharing Mechanism for Device-to-Device Communications with Graph Reinforcement Learning

Deep Multiagent Reinforcement-Learning-Based Resource Allocation for Internet of Controllable Things

Dynamic Spectrum Access for C-V2X via Imitating Indian Buffet Process