Learning Low-Rank Structured Sparsity in Recurrent Neural Networks

Weijing Wen, Fan Yang, Yangfeng Su, Dian Zhou, Xuan Zeng
DOI: https://doi.org/10.1109/ISCAS45731.2020.9181239
2020-01-01
Abstract: Acceleration and wide deployment of deeper recurrent neural networks are hindered by their high demand for computation and memory on devices with memory and latency constraints. In this work, we propose a novel regularization method to learn hardware-friendly sparse structures for deep recurrent neural networks. Because dimensions must stay consistent across consecutive time steps in a recurrent neural network, low-rank structured sparse approximations of the weight matrices are learned through the regularization without dimension distortion. Our method is architecture-agnostic and can learn compact models with a higher degree of sparsity than the state-of-the-art structured sparsity learning method. Structured sparsity, unlike random sparsity, also facilitates hardware implementation. Experiments on language modeling with the Penn TreeBank dataset show that our approach can reduce the parameters of a stacked recurrent neural network model by over 90% with less than 1% perplexity loss. It is also successfully evaluated on a larger highway neural network model with word2vec datasets such as enwik8 and text8, using only 20M weights.
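The idea of low-rank structured sparsity "without dimension distortion" can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: the weight matrix is factored as W ≈ U V, and a group-lasso penalty treats column k of U together with row k of V as one group. The function names, the tolerance, and the layer sizes are hypothetical, and the authors' actual regularizer may differ.

```python
import numpy as np

def rank_group_norms(U, V):
    """L2 norm of each rank-component group (column k of U plus row k of V)."""
    return np.sqrt((U ** 2).sum(axis=0) + (V ** 2).sum(axis=1))

def group_lasso_penalty(U, V):
    """Assumed regularizer: sum of group norms, which pushes whole rank components to zero."""
    return rank_group_norms(U, V).sum()

def prune_rank(U, V, tol=1e-3):
    """Drop rank components whose group norm fell below tol (hypothetical threshold)."""
    keep = rank_group_norms(U, V) > tol
    return U[:, keep], V[keep, :]

def param_reduction(n_out, n_in, rank):
    """Fraction of parameters saved by storing U (n_out x rank) and V (rank x n_in)."""
    return 1.0 - rank * (n_out + n_in) / (n_out * n_in)

# Toy example: a 4 x 5 weight matrix factored with 3 rank components.
U = np.ones((4, 3))
V = np.ones((3, 5))
U[:, 1] = 0.0  # pretend training drove component 1 to zero
V[1, :] = 0.0

U_small, V_small = prune_rank(U, V)
W_full = U @ V
W_small = U_small @ V_small
```

Pruning a rank component shrinks only the inner dimension of the factorization, so `W_small` keeps the same 4 x 5 shape as `W_full` and the hidden-state size seen by adjacent time steps is unchanged. As an illustrative figure that is not from the paper, `param_reduction(1500, 1500, 50)` is roughly 0.93.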