GhostRNN: Reducing State Redundancy in RNN with Cheap Operations

Hang Zhou, Xiaoxu Zheng, Yunhe Wang, Michael Bi Mi, Deyi Xiong, Kai Han
DOI: https://doi.org/10.21437/Interspeech.2023-2417
2024-11-20
Abstract: Recurrent neural networks (RNNs) that are capable of modeling long-distance dependencies are widely used in various speech tasks, e.g., keyword spotting (KWS) and speech enhancement (SE). Due to the limitation of power and memory in low-resource devices, efficient RNN models are urgently required for real-world applications. In this paper, we propose an efficient RNN architecture, GhostRNN, which reduces hidden state redundancy with cheap operations. In particular, we observe that partial dimensions of hidden states are similar to the others in trained RNN models, suggesting that redundancy exists in specific RNNs. To reduce the redundancy and hence computational cost, we propose to first generate a few intrinsic states, and then apply cheap operations to produce ghost states based on the intrinsic states. Experiments on KWS and SE tasks demonstrate that the proposed GhostRNN significantly reduces the memory usage (~40%) and computation cost while keeping performance similar.
Computation and Language, Artificial Intelligence, Sound, Audio and Speech Processing
What problem does this paper attempt to address?
The problem this paper addresses is how to significantly reduce the computational and memory costs of Recurrent Neural Network (RNN) models deployed on resource-constrained devices (such as mobile phones) while maintaining high performance. Specifically, the authors observe that the hidden states of trained RNN models contain redundancy, which leads to unnecessary computation and storage overhead. They therefore propose a new RNN architecture, GhostRNN, which generates "ghost states" through inexpensive operations to reduce hidden-state redundancy and, with it, computational and memory requirements.

### Main problem summary:

1. **High computational and memory costs**: Traditional RNN models deployed on low-resource devices struggle to meet the requirements of practical applications because of their high computational and memory demands.
2. **Redundancy in hidden states**: In trained RNN models, some hidden state dimensions are similar to others. This redundancy adds unnecessary computation.

### Solutions:

- Propose the GhostRNN architecture, which first generates a small number of "intrinsic states" and then applies inexpensive operations (such as simple linear transformations and activation functions) to produce additional "ghost states", reducing the redundancy of hidden states.
- Show experimentally that GhostRNN significantly reduces the number of parameters, computational complexity, and memory usage while maintaining performance.

### Experimental results:

- In the Keyword Spotting (KWS) task, the GhostRNN model achieved a 0.1% improvement in accuracy on the Google Speech Commands dataset, with about 40% fewer parameters.
- In the Speech Enhancement (SE) task, the GhostRNN model improved SDR and Si-SDR by about 0.1 dB on the LibriMix dataset, with about 40% fewer parameters.
In conclusion, this paper aims to develop an efficient RNN architecture that reduces hidden-state redundancy so that RNN models can meet the practical requirements of resource-constrained devices.
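
The intrinsic-state and ghost-state idea summarized above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a vanilla tanh RNN cell for brevity, a single dense linear map followed by tanh as the "cheap operation", and a hypothetical `ratio` parameter (hidden size divided by intrinsic size). The function names, weight names, and initialization scale are all illustrative assumptions.

```python
import numpy as np


def init_ghost_rnn(input_dim, hidden_dim, ratio=2, seed=0):
    """Create weights for an illustrative ghost-style RNN cell.

    `ratio` is a hypothetical knob: hidden_dim / intrinsic_dim.
    """
    rng = np.random.default_rng(seed)
    intrinsic_dim = hidden_dim // ratio
    ghost_dim = hidden_dim - intrinsic_dim
    scale = 0.1  # arbitrary initialization scale for the sketch
    return {
        # The recurrent update only produces the intrinsic states.
        "W_x": rng.normal(0.0, scale, (input_dim, intrinsic_dim)),
        "W_h": rng.normal(0.0, scale, (hidden_dim, intrinsic_dim)),
        "b": np.zeros(intrinsic_dim),
        # Assumed cheap operation: one small linear map from intrinsic to ghost.
        "W_cheap": rng.normal(0.0, scale, (intrinsic_dim, ghost_dim)),
    }


def ghost_rnn_step(params, x_t, h_prev):
    """One time step: compute intrinsic states, then derive ghost states."""
    intrinsic = np.tanh(x_t @ params["W_x"] + h_prev @ params["W_h"] + params["b"])
    ghost = np.tanh(intrinsic @ params["W_cheap"])
    # The full hidden state is the concatenation of both parts.
    return np.concatenate([intrinsic, ghost], axis=-1)


def count_params(params):
    """Total number of scalar weights in the cell."""
    return sum(p.size for p in params.values())


# Usage: run the cell over a random 10-step sequence of 40-dim features.
params = init_ghost_rnn(input_dim=40, hidden_dim=64, ratio=2)
h = np.zeros(64)
for x_t in np.random.default_rng(1).normal(size=(10, 40)):
    h = ghost_rnn_step(params, x_t, h)
print(h.shape, count_params(params))
```

In this sketch the recurrent weights only have to produce half of the hidden state, so the cell holds fewer weights than a standard tanh RNN of the same hidden size (which would need `input_dim * hidden_dim + hidden_dim * hidden_dim + hidden_dim`). The exact saving depends on the assumed `ratio` and on the form of the cheap operation, so it should not be read as reproducing the figures reported in the paper.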