Abstract: An adaptive statistical language model is described, which successfully integrates long-distance linguistic information with other knowledge sources. Most existing statistical language models exploit only the immediate history of a text. To extract information from further back in the document's history, we propose and use trigger pairs as the basic information-bearing elements. This allows the model to adapt its expectations to the topic of discourse. Next, statistical evidence from multiple sources must be combined. Traditionally, linear interpolation and its variants have been used, but these are shown here to be seriously deficient. Instead, we apply the principle of Maximum Entropy (ME). Each information source gives rise to a set of constraints, to be imposed on the combined estimate. The intersection of these constraints is the set of probability functions which are consistent with all the information sources. The function with the highest entropy within that set is the ME solution. Given consistent statistical evidence, a unique ME solution is guaranteed to exist, and an iterative algorithm exists which is guaranteed to converge to it. The ME framework is extremely general: any phenomenon that can be described in terms of statistics of the text can be readily incorporated. An adaptive language model based on the ME approach was trained on the Wall Street Journal corpus, and showed a 32–39% perplexity reduction over the baseline. When interfaced to SPHINX-II, Carnegie Mellon's speech recognizer, it reduced the recognizer's error rate by 10–14%. This illustrates the feasibility of incorporating many diverse knowledge sources in a single, unified statistical framework.
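The constraint-based formulation can be illustrated on a toy problem. The sketch below is not the model described in the abstract: it assumes Generalized Iterative Scaling (GIS) as the iterative algorithm (the abstract does not name one), fits a small joint distribution rather than a conditional language model, and uses a hypothetical trigger pair "loan" → "bank" with made-up target expectations. The names `gis`, `distribution`, `events`, `features` and `targets` are illustrative only.

```python
import math

def distribution(rows, lambdas):
    """Return p(x) proportional to exp(sum_i lambda_i * f_i(x)) over all events."""
    scores = [math.exp(sum(l * f for l, f in zip(lambdas, row))) for row in rows]
    z = sum(scores)
    return [s / z for s in scores]

def gis(events, features, targets, iterations=5000):
    """Fit a maximum entropy distribution with Generalized Iterative Scaling."""
    rows = [[f(x) for f in features] for x in events]
    # GIS needs the features to sum to the same constant c on every event,
    # so a slack feature is appended to make up the difference.
    c = max(sum(row) for row in rows)
    rows = [row + [c - sum(row)] for row in rows]
    goals = list(targets) + [c - sum(targets)]
    lambdas = [0.0] * len(goals)
    for _ in range(iterations):
        p = distribution(rows, lambdas)
        for i, goal in enumerate(goals):
            expected = sum(pk * row[i] for pk, row in zip(p, rows))
            if goal > 0 and expected > 0:
                # Move each weight toward matching its target expectation.
                lambdas[i] += math.log(goal / expected) / c
    return dict(zip(events, distribution(rows, lambdas)))

# Toy event space (assumed): t = 1 if the trigger word "loan" occurred
# earlier in the document, w = the next word.
events = [(t, w) for t in (0, 1) for w in ("bank", "river", "the")]

features = [
    lambda x: 1 if x[1] == "bank" else 0,                # unigram: w is "bank"
    lambda x: 1 if x[0] == 1 and x[1] == "bank" else 0,  # trigger pair loan -> bank
    lambda x: 1 if x[0] == 1 else 0,                     # trigger seen in history
]
targets = [0.3, 0.2, 0.4]  # made-up, mutually consistent expectations

p = gis(events, features, targets)
for event in events:
    print(event, round(p[event], 4))
```

Each feature contributes one constraint (its expectation under the model must equal the target), and probability mass not pinned down by a constraint is spread uniformly. With these targets the analytic ME solution gives P("bank" | trigger seen) = 0.2 / 0.4 = 0.5 and P("bank" | trigger not seen) = 0.1 / 0.6, so the fitted values should approach those ratios.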

Maximum Entropy based Rule Selection Model for Syntax-based Statistical Machine Translation.

Tree-State Based Rule Selection Models for Hierarchical Phrase-Based Machine Translation.

An Improved Maximum Entropy Language Model and Its Application

Translation rules extraction for statistical machine translation

Maximum Rank Correlation Training for Statistical Machine Translation.

Statistical Machine Translation Based on Translation Rules

Model and Simulation of Maximum Entropy Phrase Reordering of English Text in Language Learning Machine

A maximum entropy approach to adaptive statistical language modelling

A Context-Aware Topic Model for Statistical Machine Translation.

Syntax-Aware Complex-Valued Neural Machine Translation

Statistical Translation Model Based On Source Syntax Structure

Transductive Minimum Error Rate Training for Statistical Machine Translation.

Combined maximum entropy language model using different feature sets

Incorporating Linguistic Structure into Maximum Entropy Language Models

Syntax-Enhanced Neural Machine Translation with Syntax-Aware Word Representations

An EM Algorithm for SCFG in Formal Syntax-Based Translation

Prior Derivation Models For Formally Syntax-Based Translation Using Linguistically Syntactic Parsing and Tree Kernels.

A Hybrid Model for Enhancing Lexical Statistical Machine Translation (SMT)

Coordinate System Selection for Minimum Error Rate Training in Statistical Machine Translation

Basic Grammar Rule and Maximum Entropy Based Hybrid Model for Named Entity Recognition

CHINESE-ENGLISH MACHINE TRANSLATION DISAMBIGUATING WITH RULE-BASED METHOD COMBINED WITH STATISTIC-BASED METHOD