Off-Policy - Soft Actor-Critic-based Adaptive Streaming for 360-degree Video in Heterogeneous Wireless Networks.

Chen Jiang, Feng Chen, Pingping Chen
DOI: https://doi.org/10.1109/WCSP52459.2021.9613395
2021-01-01
Abstract: Poor data efficiency and delayed rewards limit the use of deep reinforcement learning (DRL) approaches for 360-degree video streaming in heterogeneous wireless networks. In this paper, we combine a DRL-based approach with priority-based multi-path communication for 360-degree video streaming. First, we adopt soft actor-critic for discrete actions (SAC-D) to decide the network utilization ratio of each link, which provides state-of-the-art off-policy learning over far-future decisions for multiple paths. Second, we propose priority-aware frame scheduling to further maximize video quality. The scheduler orders the scalable video coding (SVC) bitstream to flexibly exploit the spatial and quality characteristics of the tile-based approach and thereby reduce computational complexity. Finally, we evaluate the proposed scheme on a semi-physical test platform. Experimental results show that our algorithm significantly outperforms the comparison algorithm in terms of overall quality of experience (QoE) and base layer (BL) freeze ratio.
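The abstract does not spell out the scheduling rule, so the following is only a minimal sketch of what a priority-aware ordering of a tiled SVC bitstream could look like. It assumes that base-layer units go first, that viewport tiles precede other tiles within a layer, and that an enhancement layer is only useful once the layer below it has been sent. The names `SvcUnit`, `priority_key`, and `schedule`, and the single byte budget, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SvcUnit:
    """One SVC layer of one tile (hypothetical structure, not from the paper)."""
    tile: int          # tile index within the 360-degree frame
    layer: int         # 0 = base layer (BL), 1+ = enhancement layers
    in_viewport: bool  # True if the tile overlaps the predicted viewport
    size: int          # encoded size in bytes


def priority_key(unit):
    # Assumed priority: lower layers first, then viewport tiles before
    # non-viewport tiles, then tile index as a stable tie-breaker.
    return (unit.layer, not unit.in_viewport, unit.tile)


def schedule(units, budget):
    """Greedily pick units in priority order until the byte budget is spent.

    An enhancement-layer unit is skipped unless the layer directly below it
    for the same tile has already been scheduled (SVC layer dependency).
    """
    sent = []
    sent_ids = set()
    remaining = budget
    for unit in sorted(units, key=priority_key):
        dependency_met = unit.layer == 0 or (unit.tile, unit.layer - 1) in sent_ids
        if dependency_met and unit.size <= remaining:
            sent.append(unit)
            sent_ids.add((unit.tile, unit.layer))
            remaining -= unit.size
    return sent


units = [
    SvcUnit(tile=1, layer=1, in_viewport=False, size=200),
    SvcUnit(tile=0, layer=1, in_viewport=True, size=200),
    SvcUnit(tile=1, layer=0, in_viewport=False, size=100),
    SvcUnit(tile=0, layer=0, in_viewport=True, size=100),
]
for unit in schedule(units, budget=450):
    print(unit.tile, unit.layer)
```

In this sketch the byte budget stands in for the capacity that a per-link utilization ratio would make available. How the paper maps the SAC-D output to actual per-path budgets is not stated in the abstract.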