Adaptive Video Streaming Based on Learning Intrinsic Reward

Yining Feng, Ying Wang, Hongyang Liu, Lin Cong, Yan Liu
DOI: https://doi.org/10.1109/bmsb55706.2022.9828616
2022-01-01
Abstract: Adaptive bitrate (ABR) algorithms based on reinforcement learning (RL) can actively learn bitrate control policies and adapt to different network environments. However, user quality of experience (QoE), which serves as the optimization objective, comprises multiple indicators. With such a complex task objective, the challenge for RL-based ABR algorithms is whether the reward feedback from the environment can effectively guide policy updates. We propose an ABR algorithm based on learning an intrinsic reward, which encourages exploration by enhancing the agent's intrinsic motivation and helps the agent understand transitions between states so that it can form an effective policy. The results show that our approach improves the average QoE value.
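To make the idea in the abstract concrete, the following is a minimal sketch of how an extrinsic QoE reward and an intrinsic reward might be combined. It is not the authors' method: the abstract does not give the QoE formula or the intrinsic reward definition. The linear QoE form, the prediction-error bonus, and every name and weight below (`qoe_reward`, `intrinsic_reward`, `total_reward`, `rebuf_penalty`, `smooth_penalty`, `beta`) are assumptions chosen only for illustration.

```python
def qoe_reward(bitrate_mbps, rebuffer_s, prev_bitrate_mbps,
               rebuf_penalty=1.0, smooth_penalty=1.0):
    """Per-chunk QoE as a weighted sum of several indicators.

    Assumed form (not from the paper): reward higher bitrate, penalize
    rebuffering time and bitrate switches. The weights are placeholders.
    """
    return (bitrate_mbps
            - rebuf_penalty * rebuffer_s
            - smooth_penalty * abs(bitrate_mbps - prev_bitrate_mbps))


def intrinsic_reward(predicted_next_state, actual_next_state, scale=0.5):
    """Curiosity-style bonus: scaled squared error of a forward model.

    Assumption: the agent keeps a model that predicts the next state, and
    poorly predicted transitions earn a larger exploration bonus.
    """
    squared_error = sum((p - a) ** 2
                        for p, a in zip(predicted_next_state, actual_next_state))
    return scale * squared_error


def total_reward(extrinsic, intrinsic, beta=0.1):
    """Reward used for the policy update: environment reward plus weighted bonus."""
    return extrinsic + beta * intrinsic


if __name__ == "__main__":
    r_ext = qoe_reward(2.0, 0.5, 1.0, rebuf_penalty=4.0, smooth_penalty=1.0)
    r_int = intrinsic_reward([1.0, 2.0], [1.0, 4.0])
    print(r_ext, r_int, total_reward(r_ext, r_int))
```

In this sketch the intrinsic term shrinks as the forward model improves, so the exploration bonus fades for transitions the agent already predicts well, while the QoE term continues to drive the policy.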