N-gram FST Indexing for Spoken Term Detection

Chao Liu,Dong Wang,Javier Tejedor
DOI: https://doi.org/10.21437/interspeech.2012-558
2012-01-01
Abstract:An efficient indexing scheme is essentially important for spoken term detection (STD) on large databases, particularly for phone-based systems that have been widely adopted to achieve vocabulary-independent detection. While the finite state transducer (FST) composition provides a standard indexing approach, the n-gram reverse indexing is more flexible in connectivity representation and confidence measuring and therefore may result in better performance than searching within the original lattices or the equivalent FSTs. In this paper we present an n-gram FST indexing approach which combines the flexibility of n-gram indexing and the efficiency of FST indexing. Specifically, we employ the n-gram indexing to relax connectivity in original lattices and then formalize the indices into an FST for online search. We demonstrate this approach with a phone-based STD task where the lattice is sparse due to strong language models. The results show that n-gram FST indexing provides not only better detection performance than lattice search, but a faster detection than both conventional n-gram and FST indexing.
What problem does this paper attempt to address?