Rapid development of a Latvian speech-to-text system

Ilya Oparin, Lori Lamel, Jean-Luc Gauvain
DOI: https://doi.org/10.1109/icassp.2013.6639082
2013-05-01
Abstract: This paper describes the development of a Latvian speech-to-text (STT) system at LIMSI within the Quaero project. One aim of the speech processing activities in the Quaero project is to cover all official European languages. For some of these languages, however, very limited training resources, if any, are available from corpora agencies such as LDC and ELRA. The aim of this study was to show, taking Latvian as an example, how an STT system can be developed rapidly without any transcribed training data. Following the scheme proposed in this paper, the Latvian STT system was developed in about a month and obtained a word error rate of 20% on broadcast news and conversation data in the Quaero 2012 evaluation campaign.