An Improved Maximum Entropy Language Model

GL Fang, G Wen
DOI: https://doi.org/10.1109/icosp.2002.1179977
2002-01-01
Abstract: An improved maximum entropy language model (IMELM) is presented that addresses three aspects of language modeling (LM) improvement: handling long-distance dependencies, integrating linguistic knowledge into the LM, and providing a general framework that combines all kinds of linguistic knowledge. The proposed model combines trigrams with base phrase structure knowledge. The trigram captures local relations between words, while base phrase structure knowledge represents long-distance relations between syntactic structures. Syntactic, semantic, and lexical knowledge is integrated in the maximum entropy framework. Experimental results show that the proposed model achieves a 24% improvement in perplexity over the conventional trigram model.
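As a rough illustration of how a maximum entropy framework can combine a trigram feature with a phrase-structure feature, the following minimal sketch computes p(word | history) as a normalized exponential of summed feature weights, and evaluates perplexity on a list of events. The vocabulary, feature names, phrase tags, and weights are all hypothetical and do not come from the paper; the actual feature set and training procedure of IMELM are not specified in the abstract.

```python
import math

# Hypothetical toy vocabulary; not taken from the paper.
VOCAB = ["the", "cat", "sat", "mat"]

def features(history, word):
    """Return the names of the binary features that fire for (history, word).

    history is a dict with 'prev2' and 'prev1' (the two preceding words) and
    'phrase' (an assumed base-phrase tag for the current position, e.g. 'VP').
    """
    return [
        f"tri:{history['prev2']}_{history['prev1']}_{word}",  # local trigram feature
        f"phr:{history['phrase']}_{word}",                    # assumed phrase-structure feature
    ]

def maxent_prob(history, word, weights):
    """p(word | history) = exp(sum of active feature weights) / Z(history)."""
    def score(w):
        return math.exp(sum(weights.get(f, 0.0) for f in features(history, w)))
    z = sum(score(w) for w in VOCAB)  # normalizer over the whole vocabulary
    return score(word) / z

def perplexity(events, weights):
    """Exponential of the average negative log-probability over (history, word) events."""
    total = sum(-math.log(maxent_prob(h, w, weights)) for h, w in events)
    return math.exp(total / len(events))

# Illustrative weights only; a real model would estimate them from data.
weights = {"tri:the_cat_sat": 1.5, "phr:VP_sat": 0.8}
history = {"prev2": "the", "prev1": "cat", "phrase": "VP"}

print(maxent_prob(history, "sat", weights))
print(perplexity([(history, "sat")], weights))
```

With all weights set to zero the model reduces to a uniform distribution, so its perplexity equals the vocabulary size; informative features lower it from there.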