E-LSTM: Efficient Inference of Sparse LSTM on Embedded Heterogeneous System

Runbin Shi, Junjie Liu, Hayden K.-H. So, Shuo Wang, Yun Liang
DOI: https://doi.org/10.1145/3316781.3317813
2019-01-01
Abstract: Models built on Long Short-Term Memory (LSTM) networks have demonstrated state-of-the-art performance in sequential information processing. Previous LSTM-specific architectures allocate large on-chip memory for weight storage to alleviate the memory-bound issue and facilitate LSTM inference in cloud computing. In this paper, E-LSTM is proposed for embedded scenarios, taking into account chip area and limited data-access bandwidth. The heterogeneous hardware in E-LSTM tightly couples an LSTM co-processor with an embedded RISC-V CPU. The eSELL format is developed to represent the sparse weight matrix. With the proposed cell fusion optimization, which exploits the inherent sparsity in computation, E-LSTM achieves up to a 2.2× speedup in processing throughput.
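The abstract names the eSELL format but does not define it. As a rough illustration of why a compact sparse layout matters for a weight matrix, the sketch below packs a matrix into a generic sliced-ELLPACK-style layout and multiplies it by a vector. This is an assumption made for illustration only, not the paper's eSELL format: the function names, the slice height, and the padding scheme are all hypothetical.

```python
def to_sliced_ell(dense, slice_height):
    """Pack a dense matrix (list of rows) into a sliced-ELLPACK-style layout.

    Illustrative only; this is NOT the eSELL format from the paper.
    Rows are grouped into slices of `slice_height` rows. Within a slice,
    every row is padded to the slice's longest nonzero count, so each
    slice becomes a small rectangular block of (value, column) pairs.
    """
    slices = []
    for start in range(0, len(dense), slice_height):
        rows = dense[start:start + slice_height]
        # Keep only the nonzeros, remembering each one's column index.
        packed = [[(v, c) for c, v in enumerate(row) if v != 0] for row in rows]
        width = max((len(p) for p in packed), default=0)
        # Pad short rows with (0.0, 0); a zero value contributes nothing.
        padded = [p + [(0.0, 0)] * (width - len(p)) for p in packed]
        slices.append((start, padded))
    return slices


def sliced_ell_matvec(slices, x, n_rows):
    """Compute y = W @ x from the packed slices."""
    y = [0.0] * n_rows
    for start, padded in slices:
        for r, row in enumerate(padded):
            acc = 0.0
            for value, col in row:
                acc += value * x[col]
            y[start + r] = acc
    return y


# Hypothetical 4x4 sparse weight matrix and input vector.
W = [
    [0.0, 2.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 3.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 4.0, 0.0],
]
x = [1.0, 2.0, 3.0, 4.0]
y = sliced_ell_matvec(to_sliced_ell(W, slice_height=2), x, n_rows=len(W))
print(y)
```

Padding per slice rather than per matrix keeps the wasted storage bounded by the longest row in each small group, which is the usual motivation for sliced layouts when on-chip memory and bandwidth are limited.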