Mongolian Text-to-Speech Challenge Under Low-Resource Scenario for NCMMSC2022

Rui Liu, Zhen-Hua Ling, Yi-Fan Hu, Hui Zhang, Guang-Lai Gao
DOI: https://doi.org/10.1007/978-981-99-2401-1_20
2023-01-01
Abstract: The Mongolian Text-to-Speech (TTS) Challenge under Low-Resource Scenario is a special session of the National Conference on Man-Machine Speech Communication 2022 (NCMMSC2022), termed NCMMSC2022-MTTSC. A Mongolian TTS dataset was provided to participants this year, and a low-resource Mongolian TTS task was designed. Specifically, the task is to synthesize high-quality Mongolian speech from given Mongolian scripts. Thirteen teams submitted their results for final evaluation. Mean opinion score (MOS) listening tests were conducted online to measure the naturalness and intelligibility of the synthetic speech. In addition, the word error rate (WER) of automatic speech recognition was used as an objective metric for intelligibility evaluation. The evaluation results show that the top system achieved naturalness and intelligibility comparable to those of the ground-truth speech.
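The two metrics named in the abstract can be illustrated with a minimal sketch. This is not the challenge's evaluation code: the function names are hypothetical, whitespace tokenization is an assumption (the abstract does not say how Mongolian transcripts were segmented or which recognizer was used), and MOS is taken here as the plain arithmetic mean of listener ratings. WER is computed in the standard way, as the word-level edit distance between the reference script and the recognizer output, divided by the number of reference words.

```python
def mean_opinion_score(ratings):
    """Arithmetic mean of listener ratings (assumed here to be on a 1-5 scale)."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    return sum(ratings) / len(ratings)


def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words.

    Assumption: words are separated by whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (Levenshtein distance, one row at a time).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

For example, under these assumptions `word_error_rate("a b c d", "a x c")` counts one substitution and one deletion against four reference words. A lower WER on the recognized synthetic speech suggests higher intelligibility, which is why it can complement the subjective MOS ratings.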