A 3.8-μW 10-Keyword Noise-Robust Keyword Spotting Processor Using Symmetric Compressed Ternary-Weight Neural Networks

Bo Liu, Na Xie, Renyuan Zhang, Haichuan Yang, Ziyu Wang, Deliang Fan, Zhen Wang, Weiqiang Liu, Hao Cai
DOI: https://doi.org/10.1109/ojsscs.2023.3312354
2023-01-01
IEEE Open Journal of the Solid-State Circuits Society
Abstract: A ternary-weight neural network (TWN) inspired keyword spotting (KWS) processor is proposed to support complicated and variable application scenarios. To achieve high-precision recognition of ten keywords under a wide range of background noise, from 5 dB to clean, a convolutional neural network consisting of four convolution layers and four fully connected layers is used, trained with a modified sparsity-controllable truncated Gaussian approximation-based ternary-weight method. End-to-end optimization composed of three techniques is utilized: 1) a stage-by-stage bit-width selection algorithm to optimize the hardware overhead of the FFT; 2) a lossy compressed TWN with symmetric kernel training (SKT) and a dedicated internal data reuse computation flow; and 3) an error intercompensation approximate addition tree to reduce the computation overhead with marginal accuracy loss. Fabricated in an industrial 22-nm CMOS process, the processor recognizes up to ten keywords in real time under 11 background noise types, with accuracies of 90.6% (clean) and 85.4% (5 dB). It consumes an average power of $3.8~\mu\text{W}$ at 250 kHz, and its normalized energy efficiency is $2.79\times$ higher than the state of the art.
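For readers unfamiliar with ternary-weight networks, the following is a minimal sketch of generic threshold-based ternarization, in which each weight is mapped to a scaled value in {-1, 0, +1}. It is not the paper's truncated Gaussian approximation-based training method, which the abstract does not detail. The function name `ternarize` and the parameter `threshold_factor` are hypothetical, and the default of 0.7 times the mean absolute weight is a commonly used heuristic assumed here for illustration only.

```python
def ternarize(weights, threshold_factor=0.7):
    """Map real-valued weights to codes in {-1, 0, +1} plus one scale alpha.

    Illustrative sketch only: a generic threshold rule, not the training
    method described in the paper. threshold_factor is an assumed knob;
    raising it zeroes more weights (higher sparsity).
    """
    if not weights:
        return [], 0.0

    # Threshold is a fraction of the mean absolute weight.
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    delta = threshold_factor * mean_abs

    # Weights within [-delta, delta] become 0; the rest keep only their sign.
    codes = [0 if abs(w) <= delta else (1 if w > 0 else -1) for w in weights]

    # Scale factor: mean magnitude of the weights that were not zeroed.
    kept = [abs(w) for w, c in zip(weights, codes) if c != 0]
    alpha = sum(kept) / len(kept) if kept else 0.0
    return codes, alpha


codes, alpha = ternarize([0.9, -0.05, 0.4, -0.8, 0.1, -0.3])
print(codes, alpha)
```

With ternary codes, a multiply-accumulate reduces to additions and subtractions of activations followed by a single scaling by `alpha`, which is one reason such networks suit low-power hardware.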