The IBM Submission to the 2008 Text-to-Speech Blizzard Challenge

Raul Fernandez, Zvi Kons, Slava Shechtman, Zhiwei Shuang, Ron Hoory, Bhuvana Ramabhadran, Yong Qin
DOI: https://doi.org/10.21437/blizzard.2008-9
2008-01-01
Abstract: The 2008 Blizzard speech synthesis challenge provided participants with an opportunity to evaluate their systems in UK English and Mandarin. This paper describes the work behind three IBM systems submitted to the challenge for these two languages. The systems presented are concatenative unit-selection text-to-speech synthesis systems consisting of a core algorithmic base, as well as some algorithmic variants introduced not just to address the language-specific component of the synthesis engines (i.e., text-processing front-end) but also to better serve the different properties of different language types (i.e., tonal nature of Mandarin). The resulting systems were evaluated with several tasks designed to address issues like overall naturalness, intelligibility and the preservation of speaker identity. All the IBM systems submitted achieved very good performance in the two languages across the different tasks reported in this paper.