Pioneer: Offline Reinforcement Learning Based Bandwidth Estimation for Real-Time Communication.

Bingcong Lu, Keyu Wang, Jun Xu, Rong Xie, Li Song, Wenjun Zhang
DOI: https://doi.org/10.1145/3625468.3652174
2024-01-01
Abstract: For Real-time Communication (RTC), Bandwidth Estimation (BWE) is crucial for enhancing user Quality of Experience (QoE) by ensuring efficient bandwidth utilization and low latency. Recent work has shifted towards machine-learning-based algorithms, particularly online reinforcement learning (RL), which dynamically infer future bandwidth from statistical data. However, dependency on training settings, the need for extensive trial and error, and instability in complex state spaces hinder their efficacy. To address these limitations, we propose Pioneer, a novel offline RL framework for BWE in RTC systems. Unlike its predecessors, Pioneer eliminates the need for real-time environment interaction during training and achieves good performance through lightweight training. Our framework consists of a Trajectory Sampler for preprocessing state information and a Bandwidth Estimator based on an offline RL model. Our test results on offline datasets show that Pioneer can achieve better performance than expert algorithms. We also tested Pioneer on online simulation platforms, where it improves QoE by 9% compared to other offline algorithms, demonstrating good robustness.