Syllable Based Audio Search Using Confusion Network Arc As Indexing Unit

Jian Shao, Pengyuan Zhang, Jiang Han, Jun Yang, Yonghong Yan
2006-01-01
Abstract: Compared to English, Chinese has a simpler and more restricted syllabic structure. To exploit this characteristic, the syllable is selected as the unit for ASR lattice representation. For fast retrieval, syllable lattices are clustered into linear confusion network lattices and then encoded into an inverted index. To recover the posterior probabilities of word hypotheses pruned from the confusion network, a syllable confusion matrix is used to calculate the relevance score of a given keyword. Experiments on the corpora for the keyword spotting task in the 2005 HTRDP ASR Evaluation show that the proposed approach not only yields a compact inverted index and supports quick keyword queries, but also achieves an EER of 46.75%.
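The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data layout (a confusion network as a list of slots mapping syllable to posterior), the function names `build_index` and `search`, the form of the confusion table, and the scoring rule (a product over keyword syllables of confusion-weighted slot posteriors) are all assumptions made for illustration, since the abstract does not give the exact formula.

```python
from collections import defaultdict


def build_index(networks):
    """Build an inverted index from confusion networks.

    networks: {utt_id: [slot, ...]}, where each slot is a dict
    {syllable: posterior}. Every arc becomes one posting
    (utt_id, slot_position, posterior) under its syllable.
    This layout is an assumption, not taken from the paper.
    """
    index = defaultdict(list)
    for utt_id, slots in networks.items():
        for pos, slot in enumerate(slots):
            for syllable, posterior in slot.items():
                index[syllable].append((utt_id, pos, posterior))
    return index


def search(index, keyword, confusion, threshold=0.0):
    """Score keyword occurrences using a syllable confusion table.

    keyword: list of syllables.
    confusion: hypothetical table, confusion[target][observed] is the
    weight given to an indexed arc `observed` when `target` is queried.
    A target missing from the table falls back to exact matching.
    """
    per_syllable = []
    for target in keyword:
        scores = defaultdict(float)
        weights = confusion.get(target, {target: 1.0})
        for observed, weight in weights.items():
            for utt_id, pos, posterior in index.get(observed, []):
                # Soft match: arc posterior weighted by confusability.
                scores[(utt_id, pos)] += posterior * weight
        per_syllable.append(scores)

    hits = []
    for (utt_id, start), score in per_syllable[0].items():
        # Require consecutive slots for the remaining syllables.
        for i in range(1, len(keyword)):
            score *= per_syllable[i].get((utt_id, start + i), 0.0)
        if score > threshold:
            hits.append((utt_id, start, score))
    return sorted(hits, key=lambda hit: -hit[2])


if __name__ == "__main__":
    # Toy data: in utt2 the arc "bei" was pruned, leaving only "pei".
    networks = {
        "utt1": [{"bei": 0.9, "pei": 0.1}, {"jing": 0.6, "jin": 0.4}],
        "utt2": [{"pei": 1.0}, {"jing": 1.0}],
    }
    confusion = {
        "bei": {"bei": 0.9, "pei": 0.1},
        "jing": {"jing": 0.8, "jin": 0.2},
    }
    for hit in search(build_index(networks), ["bei", "jing"], confusion):
        print(hit)
```

In this toy setup the keyword still receives a nonzero score in `utt2`, where the correct first syllable is absent from the network. That is the kind of recovery the confusion matrix is meant to provide, though the weighting used in the paper may differ.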