NS-FDN: Near-Sensor Processing Architecture of Feature-Configurable Distributed Network for Beyond-Real-Time Always-on Keyword Spotting
Qin Li, Changlu Liu, Peiyan Dong, Yanming Zhang, Tong Li, Sheng Lin, Minda Yang, Fei Qiao, Yanzhi Wang, Li Luo, Huazhong Yang
DOI: https://doi.org/10.1109/tcsi.2021.3059649
2021-01-01
IEEE Transactions on Circuits and Systems I: Regular Papers
Abstract: Always-on keyword spotting (KWS), which detects wake-up words, has become an indispensable module in voice interaction systems. However, ultra-low-power embedded devices impose strict requirements on the energy consumption, latency, and recognition accuracy of KWS. In this work, we propose a near-sensor processing architecture of feature-configurable distributed network (NS-FDN) for always-on KWS applications. The proposed distributed network adapts to flexible keyword demands in real-world scenarios by splitting the conventional single network into distributed sub-networks. We design a channel-independent training framework to improve the recognition accuracy of the distributed networks. NS-FDN evaluates the speech features and reduces their redundancy, and it can also configure the speech features to further reduce computational complexity and improve processing speed. For deeper optimization, we implement a prototype chip in a 65 nm process with a near-sensor mixed-signal processing architecture that avoids an energy-consuming analog-to-digital converter. By jointly improving the system, algorithm, and hardware designs of KWS, our co-optimized architecture eliminates the long-standing energy consumption bottleneck of conventional KWS systems and achieves state-of-the-art system performance. Experimental results show that NS-FDN achieves 31.6% energy savings, 1.6 times memory savings, a 57 times speedup, and 3.4% higher recognition accuracy compared with the state of the art.