A Fast Algorithm for Computing Prefix Probabilities

Franz Nowak, Ryan Cotterell
DOI: https://doi.org/10.18653/v1/2023.acl-short.6
2024-03-18
Abstract: Multiple algorithms are known for efficiently calculating the prefix probability of a string under a probabilistic context-free grammar (PCFG). Good algorithms for the problem have a runtime cubic in the length of the input string. However, some proposed algorithms are suboptimal with respect to the size of the grammar. This paper proposes a novel speed-up of Jelinek and Lafferty's (1991) algorithm, whose original runtime is $O(n^3 |N|^3 + |N|^4)$, where $n$ is the input length and $|N|$ is the number of non-terminals in the grammar. In contrast, our speed-up runs in $O(n^2 |N|^3 + n^3 |N|^2)$.
Formal Languages and Automata Theory, Data Structures and Algorithms
What problem does this paper attempt to address?
The paper addresses the problem of efficiently computing the prefix probability of a string under a Probabilistic Context-Free Grammar (PCFG). Specifically, the authors propose a faster version of the Jelinek and Lafferty (1991) algorithm. The paper can be summarized as follows:

1. **Problem Background**:
   - PCFGs are widely used in Natural Language Processing (NLP) for building language models.
   - When a PCFG is used as a language model, it is necessary to compute the prefix probability, i.e., the total probability of all strings generated by the grammar that begin with a given prefix.
   - Existing efficient algorithms include those proposed by Jelinek and Lafferty (1991) and Stolcke (1995), but their runtimes scale poorly with the size of the grammar.
2. **Proposed Method**:
   - The paper proposes a new algorithm that improves upon the original Jelinek and Lafferty algorithm, reducing its runtime on dense grammars.
   - The improved algorithm has a time complexity of \(O(n^2|N|^3 + n^3|N|^2)\), where \(n\) is the length of the input string and \(|N|\) is the number of non-terminals.
   - This improves on the original algorithm's \(O(n^3|N|^3 + |N|^4)\), especially for dense grammars with a large number of non-terminals.
3. **Technical Details**:
   - By reorganizing the formula for prefix probabilities and introducing additional memoization, the algorithm avoids redundant calculations.
   - The CKY algorithm is used to precompute the inside probabilities, and additional data structures \(\gamma\) and \(\delta\) store intermediate results so that they are not recomputed.
4. **Scope of Application**:
   - The proposed algorithm applies to PCFGs in Chomsky Normal Form (CNF) and can be extended to semiring-weighted CFGs.
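To make the comparison of the two bounds concrete, they can be set side by side, writing \(n\) for the input length and \(|N|\) for the number of non-terminals as in the abstract. The balanced case \(n = |N| = k\) is an illustrative assumption and not a setting taken from the paper.

\[
\begin{aligned}
\text{original:} \quad & n^3 |N|^3 + |N|^4 \\
\text{speed-up:} \quad & n^2 |N|^3 + n^3 |N|^2 = n^2 |N|^2 \,(n + |N|) \\
\text{with } n = |N| = k: \quad & k^6 + k^4 \quad \text{versus} \quad 2k^5
\end{aligned}
\]

In this balanced case the leading terms differ by a factor of \(k/2\). More generally, the dominant \(n^3|N|^3\) term of the original bound exceeds the new bound by a factor of \(n|N| / (n + |N|)\).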
In summary, the main contribution of this paper is a more efficient algorithm for computing prefix probabilities under PCFGs, with an asymptotic improvement that matters most for grammars with many non-terminals. This is relevant wherever PCFGs are used as language models.
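The ingredients mentioned in the summary (a CKY pass for inside probabilities, followed by a prefix-probability recursion) can be illustrated with a small sketch. This is a plain, unoptimized left-corner recursion in the style of Jelinek and Lafferty (1991), not the paper's speed-up: it does not use the \(\gamma\) and \(\delta\) tables. The function names, the dictionary encoding of the grammar, and the toy grammar are all assumptions made for illustration. The sketch also assumes a CNF grammar that is tight (every non-terminal derives a finite string with probability 1) and approximates the left-corner closure with a fixed number of iterations instead of a matrix inverse.

```python
from collections import defaultdict


def inside_chart(words, nts, binary, lexical):
    """CKY inside pass: beta[i, k, X] = p(X derives exactly words[i:k])."""
    n = len(words)
    beta = defaultdict(float)
    for i, w in enumerate(words):
        for x in nts:
            beta[i, i + 1, x] = lexical.get((x, w), 0.0)
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            for (x, y, z), p in binary.items():
                for j in range(i + 1, k):
                    beta[i, k, x] += p * beta[i, j, y] * beta[j, k, z]
    return beta


def left_corner_closure(nts, binary, iters=200):
    """Approximate R = I + P_L + P_L^2 + ..., with P_L[X][Y] = sum_Z p(X -> Y Z)."""
    pl = {x: defaultdict(float) for x in nts}
    for (x, y, _z), p in binary.items():
        pl[x][y] += p
    r = {x: {y: float(x == y) for y in nts} for x in nts}
    for _ in range(iters):
        # Fixed-point step R <- I + P_L R (truncated series, an approximation).
        r = {x: {y: float(x == y) + sum(p * r[m][y] for m, p in pl[x].items())
                 for y in nts} for x in nts}
    return r


def prefix_probability(words, start, nts, binary, lexical):
    """Probability that `start` derives a string beginning with `words` (non-empty)."""
    n = len(words)
    beta = inside_chart(words, nts, binary, lexical)
    r = left_corner_closure(nts, binary)
    # e2[X][Y, Z] = sum_{X'} R[X][X'] * p(X' -> Y Z).
    # For a dense grammar this loop touches on the order of |N|^4 combinations.
    e2 = {x: defaultdict(float) for x in nts}
    for x in nts:
        for (xp, y, z), p in binary.items():
            e2[x][y, z] += r[x][xp] * p
    # pi[i][X] = probability that X derives a string beginning with words[i:].
    pi = [dict() for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for x in nts:
            if i == n - 1:
                # One remaining word: follow the left spine down to a lexical rule.
                pi[i][x] = sum(r[x][y] * lexical.get((y, words[i]), 0.0) for y in nts)
            else:
                # Y covers words[i:j] exactly, Z covers a string starting with words[j:].
                pi[i][x] = sum(e * beta[i, j, y] * pi[j][z]
                               for (y, z), e in e2[x].items()
                               for j in range(i + 1, n))
    return pi[0][start]


# Hypothetical toy grammar: S -> S S (0.3), S -> a (0.7).
NTS = ["S"]
BINARY = {("S", "S", "S"): 0.3}
LEXICAL = {("S", "a"): 0.7}

if __name__ == "__main__":
    for length in (1, 2, 3):
        print(length, prefix_probability(["a"] * length, "S", NTS, BINARY, LEXICAL))
```

For this toy grammar the values can be checked by hand: every generated string starts with `a`, so the prefix probability of `a` is 1; that of `a a` is 1 minus the probability of the string `a`, i.e. 0.3; and that of `a a a` is 1 - 0.7 - 0.147 = 0.153. The script prints these values up to floating-point rounding.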