An Energy-Efficient BNN Accelerator with Two-Stage Value Prediction for Sparse-Edge Gesture Recognition

Yongliang Zhang, Yitong Rong, Xuyang Duan, Zhen Yang, Qiang Li, Ziyu Guo, Xu Cheng, Xiaoyang Zeng, Jun Han
DOI: https://doi.org/10.1109/tcsi.2023.3320175
2024-01-01
Abstract: In recent years, natural, flexible, and contactless vision-based gesture recognition has received significant attention in human-computer interaction. However, employing convolutional neural networks (CNNs) for RGB or RGB-D gestures can result in excessive power consumption and poor energy efficiency, making them unsuitable for embedded systems. In this paper, we propose a lightweight sparse binarized neural network (sBNN) model for edge gesture recognition that achieves an accuracy of 89.43%-99.92% on four open-source gesture datasets with $\leq 20.26$ million operations (MOP) and $\leq 15.83$ kilobytes (KB) of parameters. We find high channel-level sparsity in the activation maps of the sBNN when edge gestures are used as inputs. The sparse activation maps contain multiple identical activation vectors, called sparse activation vectors (SAVs), which lead to highly repeated calculations. To avoid this redundancy, we propose a two-stage value prediction approach that skips these calculations, achieving a speedup of 1.03x-1.83x. Moreover, to reduce on-chip memory, a compression technique is applied to the sparse activation maps, providing a compression rate of 1.72x-3.45x. Finally, we implement an energy-efficient sparse BNN accelerator (SBA) on an embedded field-programmable gate array (FPGA). The experimental results show that our SBA has a latency of 26.3-46.8 $\mu\text{s}$, a power consumption of 0.807 W, and an energy efficiency of 536.22-952.70 GOPS/W at a frequency of 50 MHz. Our SBA offers lower latency, lower power consumption, and higher energy efficiency than previous state-of-the-art gesture recognition accelerators.
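The redundancy the abstract describes, where identical activation vectors trigger the same calculation repeatedly, can be illustrated with a small software sketch. This is not the paper's two-stage value prediction scheme, whose details the abstract does not give; it is a plain result cache keyed on the activation vector, and the function name `binary_conv1x1_with_reuse`, the array shapes, and the +1/-1 encoding are all assumptions made for illustration.

```python
import numpy as np

def binary_conv1x1_with_reuse(activations, weights):
    """Hypothetical helper: 1x1 binary convolution that reuses results.

    activations: (N, C) array of +1/-1 values, one channel vector per position.
    weights:     (C, K) array of +1/-1 values.
    Returns the (N, K) outputs and how many dot products were actually computed.
    """
    cache = {}  # maps the raw bytes of an activation vector to its output
    outputs = np.empty((activations.shape[0], weights.shape[1]), dtype=np.int32)
    w = weights.astype(np.int32)
    computed = 0
    for i, vec in enumerate(activations):
        key = vec.tobytes()
        if key not in cache:
            # First time this vector is seen: do the real computation.
            cache[key] = vec.astype(np.int32) @ w
            computed += 1
        # Identical vectors skip the computation and copy the cached result.
        outputs[i] = cache[key]
    return outputs, computed

# Assumed toy input: mostly "background" vectors with a few "edge" vectors.
rng = np.random.default_rng(0)
weights = rng.choice([-1, 1], size=(8, 4)).astype(np.int8)
background = -np.ones(8, dtype=np.int8)
edge = np.array([1, -1, 1, 1, -1, -1, 1, -1], dtype=np.int8)
activations = np.stack([background, background, edge,
                        background, edge, background])

outputs, computed = binary_conv1x1_with_reuse(activations, weights)
# Only two distinct vectors occur, so two of the six positions are computed.
```

In this toy input, four of the six positions reuse a cached result. A hardware accelerator would presumably detect repeated vectors with far cheaper logic than a hash lookup, so the sketch only shows why sparse edge inputs leave so much work to skip.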