A Deep Reinforcement Learning Scheme for Spectrum Sensing and Resource Allocation in ITS
Huang Wei,Yuyang Peng,Ming Yue,Jiale Long,Fawaz AL-Hazemi,Mohammad Meraj Mirza
DOI: https://doi.org/10.3390/math11163437
IF: 2.4
2023-08-09
Mathematics
Abstract:In recent years, the Internet of Vehicles (IoV) has been found to be of huge potential value in the promotion of the development of intelligent transportation systems (ITSs) and smart cities. However, the traditional scheme in IoV has difficulty in dealing with an uncertain environment, while reinforcement learning has the advantage of being able to deal with an uncertain environment. Spectrum resource allocation in IoV faces the uncertain environment in most cases. Therefore, this paper investigates the spectrum resource allocation problem by deep reinforcement learning after using spectrum sensing technology in the ITS, including the vehicle-to-infrastructure (V2I) link and the vehicle-to-vehicle (V2V) link. The spectrum resource allocation is modeled as a reinforcement learning-based multi-agent problem which is solved by using the soft actor critic (SAC) algorithm. Considered an agent, each V2V link interacts with the vehicle environment and makes a joint action. After that, each agent receives different observations as well as the same reward, and updates networks through the experiences from the memory. Therefore, during a certain time, each V2V link can optimize its spectrum allocation scheme to maximize the V2I capacity as well as increase the V2V payload delivery transmission rate. However, the number of SAC networks increases linearly as the number of V2V links increases, which means that the networks may have a problem in terms of convergence when there are an excessive number of V2V links. Consequently, a new algorithm, namely parameter sharing soft actor critic (PSSAC), is proposed to reduce the complexity for which the model is easier to converge. The simulation results show that both SAC and PSSAC can improve the V2I capacity and increase the V2V payload transmission success probability within a certain time. Specifically, these novel schemes have a 10 percent performance improvement compared with the existing scheme in the vehicular environment. Additionally, PSSAC has a lower complexity.
mathematics