Fast Spectrum Sharing in Vehicular Networks: A Meta Reinforcement Learning Approach

Kai Huang, Zezhou Luo, Le Liang, Shi Jin
DOI: https://doi.org/10.1109/vtc2022-fall57202.2022.10012705
2022-01-01
Abstract: In this paper, we investigate the resource allocation problem in a dynamic vehicular environment, where multiple vehicle-to-vehicle links attempt to reuse the spectrum of vehicle-to-infrastructure links. We model it as a deep reinforcement learning problem and solve it with proximal policy optimization. Training a well-performing policy usually requires a massive number of interactions with the environment over a long time and is thus typically performed on a simulator. However, an agent well trained in a simulated environment may still fail when deployed in a live network, due to the inevitable differences between the two environments, termed the reality gap. We make preliminary efforts to address this issue by leveraging meta reinforcement learning, which allows the learning agent to quickly adapt to a new environment with minimal interactions after being trained across a variety of similar tasks. We demonstrate that the meta-trained policy needs only a few episodes to adapt to a new environment, and that the proposed method achieves near-optimal performance and converges rapidly.
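For readers unfamiliar with proximal policy optimization, the clipped surrogate objective it maximizes can be sketched as below. This is a generic illustration of the standard PPO objective, not the authors' implementation: the function name `ppo_clip_objective`, its arguments, and the clip range of 0.2 are assumptions chosen for the example, and the abstract does not state which hyperparameters the paper uses.

```python
import math

def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch of sampled actions.

    new_logp / old_logp: log-probabilities of the sampled actions under the
    current policy and under the policy that collected the data.
    advantages: advantage estimates for the same samples.
    clip_eps: clip range (0.2 is an assumed, commonly used value).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the data-collecting policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is one reason PPO is a common choice when environment interactions are expensive.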