L-MPC: A LUT based MuIti-LeveI Prediction-Correction Architecture for Accelerating Binary-Weight Hourglass Network

Hong Liu,Leibo Liu,Wenping Zhu,Qiang Li,Huiyu Mo,Shaojun Wei
2019-01-01
Abstract:A binary-weight hourglass network (B-HG) accelerator for landmark detection, built on the proposed look-up-table (LUT) based multi-level prediction-correction approach, is enabled for high-speed and energy-efficient processing on IoT edge devices. First, LUT with a unified mode is adopted to support convolutional neural network with fully variable weight bit precision to minimize operations of B-HG, which achieves $1.33\times-1.50\times$ speedup on multi-bit weight CNN relative to the similar solution. Second, multi-level prediction-correction model is proposed to achieve computational-efficient convolution with adaptive precision. The operations saved can be increase by about 30% than the two-stage model. Besides, nearly 77.4% of the operations in B-HG can be saved by using the combination of these two methods, yielding a 2.3× inference speedup. Third, block computing based pipeline is designed to improve the residual block deficiency in B-HG. It can not only reduce about 66.2% off-chip memory access than the baseline, but also save 60% and 31% on-chip memory space and access compared to the similar fused-layer accelerator. The proposed B-HG accelerator achieves 450 fps at 500MHz based on the simulation in TSMC 28 nm process. Meanwhile, the power efficiency is up to 8.5 TOPS/W, which is two orders of magnitude higher than the dedicated face landmark detection accelerator.
What problem does this paper attempt to address?