Abstract: Language modeling is one of the most important areas of natural language processing. It is a bridge that lets computers identify and comprehend human language, and its progress is an indicator of the development of Artificial Intelligence. Language models are widely used in Speech Recognition, Machine Translation, Information Retrieval, and Knowledge Mapping. With the rapid advance of technology and hardware, the language model has moved from statistical models to neural network models and then to deep neural network models. The wide application of deep learning has made language modeling more extensive, complex, and expensive. This paper combines personalized input, convolutional neural network (CNN) encoding, and a union gate technique with the long short-term memory (LSTM) mechanism to improve the language model. The dynamic integration of LSTM and CNN is called Gated CLSTM. In the experiments, we used the deep learning framework TensorFlow to implement the Gated CLSTM architecture. In addition, some classical optimization techniques, such as noise contrastive estimation and a recurrent projection layer, were adopted. We tested the performance of Gated CLSTM on a large-scale open corpus and trained a single-layer model and a three-layer model to observe how network depth influences performance. The single-layer model was trained for 4 days in a four-GPU environment and reduced the perplexity to 42.1. The three-layer model reduced the perplexity to 33.1 in 6 days. Compared with some classical benchmark models, Gated CLSTM achieves significant improvements in terms of hardware requirements, training time, and perplexity.
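
Since the results above are reported as perplexity, a minimal sketch of how that metric is usually computed may help: perplexity is the exponential of the average negative log-likelihood per token. The function name `perplexity` and the input format (a list of the probabilities a model assigned to each correct token) are assumptions made for illustration and are not taken from the paper's implementation.

```python
import math


def perplexity(token_probs):
    """Return exp of the mean negative log-likelihood per token.

    token_probs: probabilities (each in (0, 1]) that a model assigned
    to the correct next token at every position. Hypothetical input
    format chosen for this illustration.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


# A model that spreads its probability evenly over 4 candidates at
# every step has perplexity 4: it is "as confused as" a 4-way guess.
print(perplexity([0.25] * 8))
```

Under this reading, a perplexity of 33.1 means the model is on average about as uncertain as a uniform choice among roughly 33 words, so a lower value indicates a better model.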
