Word Activation Forces-Based Language Modeling and Smoothing

Min Qin, Gang Liu, Baoxiang Li, Yueming Lu
DOI: https://doi.org/10.1109/IHMSC.2013.140
2013-01-01
Abstract: N-gram language models are useful for modeling the local dependencies of word occurrences, but not for capturing global word dependencies. Because the window size n is limited, an n-gram model is weak at capturing long-distance dependencies. Long-distance dependency information has long been proven useful in language modeling. However, the improved performance of long-distance language models (LMs) over conventional n-gram models generally comes at the cost of increased decoding complexity and model size. Word activation forces have been shown to be a simple measure, with accuracy comparable to human judgment, for identifying a word's closest associates. In this paper, a word activation forces-based language model is proposed that captures long-distance dependencies between words while remaining as fast to decode as a conventional word n-gram model. Experiments on broadcast news show that the proposed language modeling and smoothing methods significantly reduce both the perplexity of the language models and the word error rate, at moderate computational cost.