Highly Paralleled Low-Cost Embedded HEVC Video Encoder on TI KeyStone Multicore DSP

Hongxu Jiang, Rui Fan, Yongfei Zhang, Gang Wang, Zhe Li
DOI: https://doi.org/10.1109/tcsvt.2018.2826074
IF: 5.859
2018-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Although HEVC, the emerging video coding standard, doubles the coding performance of its predecessor H.264/AVC, its significantly higher computational complexity is a major obstacle to deploying HEVC encoders in real-time applications on embedded processors such as digital signal processors (DSPs). In this paper, a highly parallel, low-cost, fast HEVC encoding solution is designed and implemented on the TI KeyStone multicore TMS320C6678 DSP. First, the overall structure of the HEVC encoder is redesigned around CTU-level parallelism, taking the hardware characteristics fully into account. Second, a low-delay, low-memory multicore data transmission mechanism is proposed to reduce the latency of data access between internal L2 memory and external DDR3. Third, the encoding bottlenecks, i.e., the most time-consuming encoding modules, are identified and accelerated with TI C6000 SIMD instructions. Experimental results show that the proposed HEVC encoder on TI TMS320C6678 DSPs significantly improves real-time capability with tolerable performance loss: specifically, a 0.93 dB performance loss at an average speedup of 465.50 times compared with the CPU-based HM reference software. This makes it desirable for power-constrained real-time video applications.