More is Less: Domain-Specific Speech Recognition Microprocessor Using One-Dimensional Convolutional Recurrent Neural Network

Bo Liu, Hao Cai, Zilong Zhang, Xiaoling Ding, Ziyu Wang, Yu Gong, Weiqiang Liu, Jinjiang Yang, Zhen Wang, Jun Yang
DOI: https://doi.org/10.1109/tcsi.2021.3134271
2022-04-01
Abstract: Low-power keyword recognition has been a focus of acoustic signal processing for several decades. This work investigates a domain-specific speech recognition microprocessor based on an optimized one-dimensional convolutional recurrent neural network (1D-CRNN). Compared to previous DNN-based frameworks, the proposed 1D-CRNN performs both feature extraction and keyword classification, and achieves high recognition accuracy with fewer computation operations across a wide range of background-noise SNRs. An energy-efficient 1D-CRNN accelerator is implemented that dynamically reconfigures itself to process the different layers. This accelerator exhibits the “More is Less” characteristic in three aspects: 1) the hybrid network with more complex layers is much more compact and requires less computation; 2) although quantizing the weights to 8 bits requires more memory and a higher multiplication energy cost, it reduces the number of network neurons required and improves hardware utilization; 3) an energy-aware self-compensation tensor multiplication unit with a dual power supply, based on an approximate design method, can be used for 1D-CRNN computing. Compared to state-of-the-art architectures, the novel more-is-less architecture achieves a much lower power consumption of $1.4~\mu\text{W}$ to $2.1~\mu\text{W}$ (a reduction of over 80%) in an industrial 22 nm technology, while maintaining higher system adaptability (supported SNRs: −5 dB to clean) for real-time recognition of 1 to 5 keywords.
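The 8-bit weight quantization mentioned in aspect 2 can be illustrated with a minimal sketch. The abstract does not describe the quantization scheme actually used, so the symmetric per-tensor scheme below, the function names quantize_int8 and dequantize, and the example weight values are all assumptions made purely for illustration.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to signed 8-bit codes.

    Assumption: one scale for the whole tensor, codes limited to [-127, 127].
    This is a generic scheme, not necessarily the one used in the paper.
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map 8-bit integer codes back to approximate float weights."""
    return [c * scale for c in codes]


weights = [0.50, -0.25, 0.125, -1.0]  # made-up example values
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
```

With this scheme each restored weight differs from the original by at most half a quantization step (scale / 2), which is the precision traded against the memory and multiplier cost of storing 8 bits per weight.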