Federated Policy Distillation for Digital Twin-Enabled Intelligent Resource Trading in 5G Network Slicing

Daniel Ayepah Mensah, Guolin Sun, Gordon Owusu Boateng, Guisong Liu
DOI: https://doi.org/10.1109/tnsm.2024.3476480
2024-01-01
IEEE Transactions on Network and Service Management
Abstract: Resource sharing in radio access networks (RAN) can be conceptualized as a resource trading process between infrastructure providers (InPs) and multiple mobile virtual network operators (MVNOs), where InPs lease essential network resources, such as spectrum and infrastructure, to MVNOs. Given the dynamic nature of RANs, deep reinforcement learning (DRL) is a well-suited approach to decision-making and resource optimization, enabling adaptive and efficient resource allocation strategies. In RAN slicing, however, DRL struggles with imbalanced data distributions and depends on high-quality training data. In addition, the trade-off between the global solution and individual agent goals can lead to oscillatory behavior, preventing convergence to an optimal solution. Therefore, we propose a collaborative intelligent resource trading framework with a graph-based digital twin (DT) for multiple InPs and MVNOs based on federated DRL. First, we present a customized mutual policy distillation scheme for resource trading, where complex MVNO teacher policies are distilled into InP student models and vice versa. This mutual distillation encourages collaboration to achieve personalized resource trading decisions that reach optimal local and global solutions. Second, the DT uses a graph-based model to capture the dynamic interactions between InPs and MVNOs to improve resource trading decisions. The DT can accurately predict resource prices and MVNO demand to provide high-quality training data. In addition, the DT identifies underlying patterns and trends through advanced analytics, enabling proactive resource allocation and pricing strategies. The simulation results and analysis confirm the effectiveness of the proposed framework and its robustness to unbalanced data distributions.
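The abstract does not give the distillation objective, so the following is only a minimal sketch of what a mutual policy distillation loss could look like, not the authors' formulation. It assumes, hypothetically, that the InP and MVNO policies output logits over a shared discrete action set and that each side is pulled toward the other's temperature-softened distribution through a KL divergence term. The function names and the temperature value are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mutual_distillation_losses(inp_logits, mvno_logits, temperature=2.0):
    """Return (loss for the InP policy, loss for the MVNO policy).

    Assumption: both policies score the same discrete actions. Each
    policy acts as student against the other policy as teacher.
    """
    p_inp = softmax(inp_logits, temperature)
    p_mvno = softmax(mvno_logits, temperature)
    loss_inp = kl_divergence(p_mvno, p_inp)   # InP student, MVNO teacher
    loss_mvno = kl_divergence(p_inp, p_mvno)  # MVNO student, InP teacher
    return loss_inp, loss_mvno


if __name__ == "__main__":
    # Hypothetical logits over three trading actions.
    print(mutual_distillation_losses([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
```

In a federated DRL setting, terms like these would typically be added to each agent's own reinforcement learning loss, so that local objectives and cross-agent agreement are optimized together. How the paper weights or schedules them is not stated in the abstract.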