An Energy-Efficient Coarse-Grained Reconfigurable Processing Unit for Multiple-Standard Video Decoding
Leibo Liu,Chenchen Deng,Dong Wang,Min Zhu,Shouyi Yin,Peng Cao,Shaojun Wei
DOI: https://doi.org/10.1109/tmm.2015.2463735
IF: 7.3
2015-01-01
IEEE Transactions on Multimedia
Abstract:In this paper, we introduce a coarse-grained dynamically reconfigurable fabric, named Reconfigurable Processing Unit (RPU), which is implemented on a 5.4×3.1 mm2 silicon with TSMC 65 nm LP1P8M technology. This fabric consists of 16×16 multi-functional Processing Elements (PEs) interconnected by an area-efficient Line-Switched Mesh Connect (LSMC) routing. A Hierarchical Configuration Context (HCC) organization scheme is proposed to reduce the scale of the context memory and enhance configuration efficiency. Two reconfigurable processors are then designed and fabricated to verify the proposed techniques. One processor (called REMUS_HPP) integrates two RPUs, targeting the high performance applications. REMUS_HPP could decode 1920×1080@30fps H.264 streams with 280mW under 200MHz, achieving a performance gain of 1.81x and a 14.3x energy efficiency improvement over XPP-III. The other processor (called REMUS_LPP) integrates only one RPU, targeting the low power applications. REMUS_LPP could decode 720×480@35fps H.264 streams with 24.81mW under 75MHz, achieving a 76% power reduction and a 3.96x energy efficiency improvement compared with ADRES. More importantly, RPU is not only limited to video decoding applications. It can also be used to process some other computation-intensive applications and the corresponding analysis is given in this paper as well.