NS-KWS: joint optimization of near-sensor processing architecture and low-precision GRU for always-on keyword spotting

Qin Li, Sheng Lin, Changlu Liu, Yidong Liu, Fei Qiao, Yanzhi Wang, Huazhong Yang
DOI: https://doi.org/10.1145/3370748.3407001
2020-01-01
Abstract: Keyword spotting (KWS) is a crucial front-end module of a speech interaction system. The always-on KWS module monitors input words and activates the complex, energy-consuming backend system when a keyword is detected. The performance of the KWS module determines the standby performance of the whole system, and conventional KWS modules face a power-consumption bottleneck in the data conversion near the microphone sensor. In this paper, we propose an energy-efficient near-sensor processing architecture for always-on KWS, which could enhance the continuous perception of the whole speech interaction system. By implementing keyword detection in the analog domain, directly after the microphone sensor, this architecture avoids energy-consuming data converters and runs faster than conventional realizations. In addition, we propose a lightweight gated recurrent unit (GRU) with negligible accuracy loss to ensure recognition performance. We also implement and fabricate the proposed KWS system in a 0.18 μm CMOS process. In the system-level evaluation, the hardware-software co-design architecture saves 65.6% of energy consumption and runs 71 times faster than the state of the art.