Layer-Wise Mixed-Modes CNN Processing Architecture With Double-Stationary Dataflow and Dimension-Reshape Strategy

Bo Liu, Xinxiang Huang, Yang Zhang, Guang Yang, Han Yan, Chen Zhang, Zejv Li, Yuanhao Wang, Hao Cai
DOI: https://doi.org/10.1109/tcsi.2024.3434706
2024-10-04
IEEE Transactions on Circuits and Systems I: Regular Papers
Abstract: With the development of convolutional neural networks (CNNs) across various domains, the growth in network structure complexity and computational load has increasingly become a research focus in the deployment of neural networks. The key to current research on neural network accelerators lies in striking a balance between computational accuracy and energy efficiency. This paper proposes a software-hardware co-design that strikes this balance for CNN edge applications. On the hardware side, a 3-dimensional tensor engine (3D-TE), built from reconfigurable Tensor Processing Units (TPUs), is introduced for efficient convolution computation. We optimize the CNN dataflow on the 3D-TE using a dimension-reshaping method for feature-map rearrangement and a double-stationary dataflow schedule to reduce memory access. The TPU architecture adopts a configurable approximate multiplier designed with Boolean Matrix Factorization (BMF)-based logic synthesis. The proposed 3D-TE, characterized by its configurable precision, enables the TPUs to dynamically adapt the bitwidth of features and weights in response to varying precision requirements. On the software side, a Hessian-guided layer precision mapping is adopted to reduce unnecessary computational overhead, and a progressive re-training approach is proposed to enable a better approximation configuration and higher power reduction. Fabricated in 28-nm CMOS, this work achieves an optimized energy efficiency of 14.9 TOPS/W and 12.1 TOPS/W for ResNet56 and MobileNetV2, respectively, at a 0.6 V supply voltage and 150 MHz clock frequency, representing an improvement over state-of-the-art works.
engineering, electrical & electronic