Spoken Term Detection from Bilingual Spontaneous Speech Using Code-Switched Lattice-Based Structures for Words and Subword Units.

Hung-Yi Lee, Yueh-Lien Tang, Hao Tang, Lin-Shan Lee
DOI: https://doi.org/10.1109/asru.2009.5372901
2009-01-01
Abstract: This paper presents the first publicly known work on spoken term detection from bilingual spontaneous speech using code-switched lattice-based structures for word and subword units. The corpus consists of lectures, with Chinese as the host language and English as the guest language, recorded for a real course offered at National Taiwan University. The techniques reported here have been successfully implemented and tested in a real lecture system now available online. We also present approaches that use word fragments as the subword unit for English, and analyse the difficulties that arise when code-switched lattice-based structures for subword units are used for tasks involving languages of quite different natures.