On the Optimal Encoding Ladder of Tiled 360° Videos for Head-Mounted Virtual Reality
Ching-Ling Fan, Shou-Cheng Yen, Chun-Ying Huang, Cheng-Hsin Hsu
DOI: https://doi.org/10.1109/tcsvt.2020.3007288
IF: 5.859
2021-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Dynamic Adaptive Streaming over HTTP (DASH) has been widely used by several popular streaming services, such as YouTube, Netflix, and Facebook. Adopting DASH requires pre-determining a set of encoding configurations, called an encoding ladder, to generate a set of representations stored on the streaming server. Clients adaptively request these representations according to their network conditions during streaming sessions. In this article, we aim to solve the optimal laddering problem, which determines the encoding ladder that maximizes the client viewing quality. In particular, we consider video models, viewing probability, and client distribution to formulate the mathematical problem. We use a divide-and-conquer approach to decompose the problem into two subproblems: (i) per-class optimization for clients with different bandwidths and (ii) global optimization to maximize the overall viewing quality under the storage limit of the streaming server. We propose two algorithms for each of the per-class optimization and global optimization problems. We conduct analyses and real experiments to evaluate the performance of our proposed algorithms against other state-of-the-art algorithms. Based on the results, we recommend a combination of the proposed algorithms to solve the optimal laddering problem. The evaluation results show the merits of our recommended algorithms, which: (i) outperform the state-of-the-art algorithms by up to 52.17 and 26.35 in Viewport Video Multi-Method Assessment Fusion (V-VMAF) in per-class optimization, (ii) outperform the state-of-the-art algorithms by up to 43.14 in V-VMAF for optimal laddering in global optimization, (iii) achieve good scalability under different storage limits and numbers of bandwidth classes, and (iv) run faster than the state-of-the-art algorithms.
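To make the two-level structure described in the abstract concrete, the following is a minimal Python sketch of a toy laddering problem. It is not one of the paper's algorithms: the greedy selection rule, the `Rep` record, the function names, and all numbers are assumptions chosen for illustration, and the sketch ignores tiling and viewing probability entirely. The per-class step picks, for each bandwidth class, the best representation that class can stream; the global step adds representations to the ladder until the storage limit is reached.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rep:
    """Hypothetical representation: one encoding configuration of a video."""
    name: str
    bitrate: float  # Mbps needed to stream it
    size: float     # storage it occupies on the server
    quality: float  # quality score (higher is better)


def best_for_class(ladder, bandwidth):
    """Per-class step: best-quality representation streamable at this bandwidth."""
    feasible = [r for r in ladder if r.bitrate <= bandwidth]
    return max(feasible, key=lambda r: r.quality, default=None)


def expected_quality(ladder, classes):
    """Client-share-weighted quality; classes is a list of (bandwidth, share)."""
    total = 0.0
    for bandwidth, share in classes:
        best = best_for_class(ladder, bandwidth)
        if best is not None:
            total += share * best.quality
    return total


def greedy_ladder(candidates, classes, storage_limit):
    """Global step (assumed greedy heuristic): repeatedly add the candidate
    with the largest quality gain per unit of storage that still fits."""
    ladder, used = [], 0.0
    while True:
        base = expected_quality(ladder, classes)
        best_gain, best_rep = 0.0, None
        for r in candidates:
            if r in ladder or used + r.size > storage_limit:
                continue
            gain = (expected_quality(ladder + [r], classes) - base) / r.size
            if gain > best_gain:
                best_gain, best_rep = gain, r
        if best_rep is None:
            return ladder
        ladder.append(best_rep)
        used += best_rep.size


# Illustrative data only: three candidates, three bandwidth classes.
candidates = [
    Rep("low", bitrate=1.0, size=1.0, quality=40.0),
    Rep("mid", bitrate=4.0, size=3.0, quality=70.0),
    Rep("high", bitrate=10.0, size=6.0, quality=90.0),
]
classes = [(2.0, 0.5), (5.0, 0.3), (12.0, 0.2)]
ladder = greedy_ladder(candidates, classes, storage_limit=4.0)
print([r.name for r in ladder], expected_quality(ladder, classes))
```

With a storage limit of 4.0, the "high" representation never fits, so the ladder is built from the two smaller candidates. A greedy rule like this carries no optimality guarantee; it only illustrates how a storage budget couples the otherwise independent per-class choices.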