A Fast CABAC Hardware Design for Accelerating the Rate Estimation in HEVC
Yujie Cai, Yibo Fan, Leilei Huang, Xiaoyang Zeng, Haibing Yin, Bing Zeng
DOI: https://doi.org/10.1109/tcsvt.2021.3093579
IF: 5.859
2021-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: The latest High Efficiency Video Coding (HEVC) standard achieves twice the coding efficiency of the H.264 standard through complex rate-distortion optimization (RDO). The coded bitstreams are produced with context-adaptive binary arithmetic coding (CABAC). CABAC itself is a very time-consuming process that includes binarization, context modeling, interval subdivision, renormalization, outstanding bit handling, and context updating. The aim of this research is to speed up the CABAC process through several simplifications. First, we approximate three parts of CABAC, i.e., interval subdivision, renormalization, and outstanding bit handling, with a piecewise-linear function that is well suited to hardware implementation. To achieve better hardware parallelism, we also improve the coding process at the sub-block level. The contexts of the syntax elements in a sub-block are redistributed to skip the complex calculation of context indexing. We perform context updating at the granularity of sub-blocks so that the data dependency of context updating is removed completely, and the original serial encoding process becomes a parallel one. In addition, we simplify the context modeling of cu_skip_flag. Based on these simplifications, we build a parallel hardware architecture for the rate estimation of the RDO process. This architecture completes the bit estimation of a $32\times 32$ coding tree unit (CTU) in 220.8 nanoseconds, while the Bjøntegaard Delta rate increases by only 2.225%. We believe that the proposed architecture can meet the requirements of 8K@120 fps ultra-high-definition videos. This is the first study to simplify the hardware design of rate estimation by changing the context allocation and updating rules.
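The 8K@120 fps claim can be sanity-checked with a back-of-envelope calculation. The sketch below is not the authors' own analysis: it assumes an 8K frame of 7680 x 4320 luma samples, exactly one rate-estimation pass per 32 x 32 CTU, and no pipeline stalls or other overheads. All names (`ctus_per_frame`, `budget_ns_per_ctu`) are hypothetical. An actual RDO loop may evaluate several candidates per CTU, which this estimate does not model.

```python
# Back-of-envelope throughput check for the 8K@120 fps claim.
# Assumptions (not from the paper): 7680x4320 frame, one estimation
# pass per CTU, no pipeline overhead.

FRAME_WIDTH = 7680    # assumed 8K UHD width in luma samples
FRAME_HEIGHT = 4320   # assumed 8K UHD height in luma samples
CTU_SIZE = 32         # CTU size stated in the abstract
FPS = 120             # target frame rate stated in the abstract
CTU_TIME_NS = 220.8   # per-CTU estimation time stated in the abstract


def ctus_per_frame(width: int, height: int, ctu: int) -> int:
    """Number of CTUs covering a frame (partial CTUs round up)."""
    cols = -(-width // ctu)   # ceiling division
    rows = -(-height // ctu)
    return cols * rows


def budget_ns_per_ctu(width: int, height: int, ctu: int, fps: int) -> float:
    """Time available per CTU, in nanoseconds, to sustain `fps`."""
    return 1e9 / (ctus_per_frame(width, height, ctu) * fps)


ctus = ctus_per_frame(FRAME_WIDTH, FRAME_HEIGHT, CTU_SIZE)
budget = budget_ns_per_ctu(FRAME_WIDTH, FRAME_HEIGHT, CTU_SIZE, FPS)
utilization = CTU_TIME_NS / budget

print(f"CTUs per frame:      {ctus}")
print(f"Budget per CTU (ns): {budget:.1f}")
print(f"Reported time (ns):  {CTU_TIME_NS}")
print(f"Budget utilization:  {utilization:.1%}")
```

Under these assumptions a frame holds 240 x 135 = 32,400 CTUs, which leaves roughly 257 ns per CTU at 120 fps, so the reported 220.8 ns fits within the budget with some margin.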