Pareto-Optimal Multi-Agent Cooperative Caching Relying on Multi-Policy Reinforcement Learning

Boyang Guo, Youjia Chen, Peng Cheng, Ming Ding, Jinsong Hu, Lajos Hanzo
DOI: https://doi.org/10.1109/jiot.2023.3317971
IF: 10.6
2023-01-01
IEEE Internet of Things Journal
Abstract: Given the popularity of flawless telepresence and the resultant explosive growth of wireless video applications, besides handling the traffic surge, satisfying the demanding user requirements for video quality has become another important goal of network operators. Inspired by this, cooperative edge caching intrinsically amalgamated with scalable video coding is investigated. Explicitly, the concept of a Pareto-optimal semi-distributed multi-agent multi-policy deep reinforcement learning (SD-MAMP-DRL) algorithm is conceived for managing the cooperation of heterogeneous network nodes. To elaborate, a multi-policy reinforcement learning algorithm is proposed for finding the Pareto-optimal policies during the training phase, which balance the tele-traffic vs. user experience trade-off. Then the optimal policy/solution can be activated during the execution phase by appropriately selecting the associated weighting coefficient according to the dynamically fluctuating network traffic load. Our experimental results show that the proposed SD-MAMP-DRL algorithm 1) achieves better performance than the benchmark algorithms; 2) obtains a near-complete Pareto front in various scenarios and selects the optimal solution by adaptively adjusting the weighting between the above-mentioned pair of objectives.
computer science, information systems; telecommunications; engineering, electrical & electronic
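
The execution-phase idea in the abstract (pick one Pareto-optimal policy by setting a weighting coefficient from the current traffic load) can be illustrated with a toy sketch. This is not the SD-MAMP-DRL algorithm and involves no learning: the candidate outcomes, the names `pareto_front`, `weight_from_load` and `select_policy`, the linear load-to-weight mapping and its thresholds are all assumptions made for illustration. Both objectives are written as costs to be minimised (a traffic cost and a user-experience loss).

```python
from typing import List, Tuple

# (traffic_cost, experience_loss): both are costs, lower is better.
# The numeric values used below are hypothetical, not taken from the paper.
Outcome = Tuple[float, float]


def pareto_front(points: List[Outcome]) -> List[Outcome]:
    """Return the non-dominated outcomes, sorted by traffic cost."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return sorted(front)


def weight_from_load(load: float, low: float = 0.2, high: float = 0.8) -> float:
    """Assumed mapping: clamp a normalised load linearly onto a weight in [0, 1]."""
    if high <= low:
        raise ValueError("high must exceed low")
    return min(1.0, max(0.0, (load - low) / (high - low)))


def select_policy(front: List[Outcome], w: float) -> Outcome:
    """Pick the outcome minimising w * traffic_cost + (1 - w) * experience_loss."""
    return min(front, key=lambda p: w * p[0] + (1.0 - w) * p[1])


# Hypothetical outcomes of five trained policies.
candidates = [(0.9, 0.1), (0.6, 0.3), (0.7, 0.5), (0.3, 0.6), (0.1, 0.9)]
front = pareto_front(candidates)

# Heavy load: the weight saturates at 1, so the lowest-traffic policy wins.
busy_choice = select_policy(front, weight_from_load(0.9))
# Light load: the weight drops to 0, so the best-experience policy wins.
idle_choice = select_policy(front, weight_from_load(0.1))
```

In this sketch the point `(0.7, 0.5)` is discarded because `(0.6, 0.3)` is better on both objectives. Note that a weighted sum can only reach points on the convex part of a front, which is one reason this should be read as an illustration of the selection step rather than of how the paper constructs its Pareto front.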