Global context-dependent recurrent neural network language model with sparse feature learning

Hongli Deng,Lei Zhang,Lituan Wang
DOI: https://doi.org/10.1007/s00521-017-3065-x
2017-06-21
Neural Computing and Applications
Abstract:Recurrent neural network language models (RNNLMs) are an important type of language model. In recent years, context-dependent RNNLMs are the most widely used ones as they apply additional information summarized from other sequences to access the larger context. However, when the sequences are mutually independent or randomly shuffled, these models cannot learn useful additional information, resulting in no larger context taken into account. In order to ensure that the model can obtain more contextual information in any case, a new language model is proposed in this paper. It can capture the global context just by the words within the current sequences, incorporating all the preceding and following words of target, without resorting to additional information summarized from other sequences. This model includes two main modules: a recurrent global context module used for extracting the global contextual information of the target and a sparse feature learning module that learns the sparse features of all the possible output words to distinguish the target word from others at the output layer. The proposed model was tested on three language modeling tasks. Experimental results show that it improves the perplexity of the model, speeds up the convergence of the network and learns better word embeddings compared with other language models.
computer science, artificial intelligence
What problem does this paper attempt to address?