Highly Efficient Neural Network Language Model Compression Using Soft Binarization Training

Rao Ma, Qi Liu, Kai Yu
DOI: https://doi.org/10.1109/asru46091.2019.9003744
2019-01-01
Abstract: The long short-term memory language model (LSTM LM) has been widely investigated for large vocabulary continuous speech recognition (LVCSR) tasks. Despite the excellent performance of the LSTM LM, its use in resource-constrained environments, such as portable devices, is limited by its high memory consumption. Binarized language models have been proposed to achieve significant memory reduction, at the cost of degraded performance at high compression ratios. In this paper, we propose a soft binarization approach to recover the performance of the binarized LSTM LM. Experiments show that the proposed method can achieve a high compression rate of 30× with almost no performance loss in both language modeling and speech recognition tasks.
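To put the 30× figure in context, the following is a minimal sketch of the memory arithmetic behind weight binarization. It assumes 32-bit floating-point weights and uses plain sign binarization with a single scaling factor; it is not the soft binarization training proposed in the paper, and the function names `binarize` and `dequantize` as well as the matrix size are hypothetical choices for illustration.

```python
import numpy as np

def binarize(weights):
    """Sign-binarize a float matrix: one bit per weight plus one float scale."""
    # Single per-matrix scaling factor (an assumption, not taken from the paper).
    scale = float(np.mean(np.abs(weights)))
    # One bit per weight, packed eight to a byte.
    bits = np.packbits(weights >= 0)
    return bits, scale

def dequantize(bits, scale, shape):
    """Rebuild the {-scale, +scale} approximation from the packed bits."""
    n = int(np.prod(shape))
    signs = np.unpackbits(bits)[:n].astype(np.float32) * 2.0 - 1.0
    return (signs * scale).reshape(shape)

rng = np.random.default_rng(0)
w = rng.standard_normal((1024, 512)).astype(np.float32)  # illustrative size

bits, scale = binarize(w)
w_hat = dequantize(bits, scale, w.shape)

# Float32 storage versus packed bits plus 4 bytes for the scale.
ratio = w.nbytes / (bits.nbytes + 4)
print(f"compression ratio: {ratio:.2f}x")
```

Under these assumptions, one bit per weight gives a ratio just under 32× for a single matrix, so the reported 30× is close to that bound. The abstract does not say which components account for the difference.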