1-Bit WaveNet: Compressing a Generative Neural Network in Speech Recognition with Two Binarized Methods

Sicheng Gao, Runqi Wang, Liuyang Jiang, Baochang Zhang
DOI: https://doi.org/10.1109/iciea51954.2021.9516334
2021-01-01
Abstract: With the advancement of deep convolutional neural networks, speech recognition systems have achieved remarkable performance on tasks in the natural language processing field. Despite this performance, resource-constrained environments limit their use in enterprise-level applications. In this paper, we use two binarized neural networks, Bi-real Net and PCNN (Projection Convolutional Neural Networks), to study the problem of compressing WaveNet, a generative model used here for recognition from raw audio waveforms. In particular, Bi-real Net and PCNN are applied to minimize the computational cost gap between the real-valued and binarized WaveNet models, which leads to a new 1-bit dilated causal convolution. We collected a dataset of over 950,000 clean keyword utterances without noise. On this dataset, 1-bit WaveNet models were trained with these binarization methods and achieved satisfactory performance.
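To make the idea of a 1-bit dilated causal convolution concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the Bi-real Net or PCNN procedure from the paper: it binarizes inputs and weights with a plain sign function and rescales by the mean absolute weight, a common choice in binarized networks. The function names `sign` and `binary_dilated_causal_conv1d` are hypothetical.

```python
def sign(x):
    """Binarize a real value to +1.0 or -1.0 (zero maps to +1.0)."""
    return 1.0 if x >= 0 else -1.0


def binary_dilated_causal_conv1d(signal, weights, dilation):
    """1-D dilated causal convolution with binarized inputs and weights.

    output[t] = alpha * sum_k sign(w[k]) * sign(x[t - k * dilation])

    Assumption (not from the paper): alpha is the mean absolute value of
    the real-valued weights, used to rescale the binary result.
    """
    alpha = sum(abs(w) for w in weights) / len(weights)
    binary_weights = [sign(w) for w in weights]
    output = []
    for t in range(len(signal)):
        acc = 0.0
        for k, bw in enumerate(binary_weights):
            idx = t - k * dilation
            # Causal: taps that fall before the start of the signal are
            # treated as zero padding, so no future sample is ever read.
            if idx >= 0:
                acc += bw * sign(signal[idx])
        output.append(alpha * acc)
    return output


if __name__ == "__main__":
    samples = [0.5, -1.0, 2.0, -0.3]
    print(binary_dilated_causal_conv1d(samples, [0.4, -0.2], dilation=2))
```

Because every product in the inner loop is between values in {+1, -1}, an optimized implementation could replace the multiplications with bitwise XNOR and popcount operations, which is where the computational savings of binarization come from.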