Multimedia Simultaneous Translation System for Minority Language Communication with Mandarin
Shen Huang, Bojie Hu, Shan Huang, Pengfei Hu, Jian Kang, Zhiqiang Lv, Jinghao Yan, Qi Ju, Shiyin Kang, Deyi Tuo, Guangzhi Li, Nurmemet Yolwas
2019-01-01
Abstract: Speech recognition for minority languages has long lagged behind that for mainstream languages because of a lack of resources. This work presents a system for simultaneous translation between Mandarin and major minority languages such as Uyghur and Tibetan, in the form of speech, text, and images. The general acoustic model is a factorized TDNN trained with the lattice-free MMI criterion, using a lexicon model based on mixed units. For each specific language, the acoustic model is trained by multi-task mix-lingual modeling with shared bottleneck layers, followed by transfer learning. In addition, the system supports state-of-the-art OCR, TTS, and machine translation, through which language content is translated, punctuated, and pronounced in real time. The machine translation component behind the system ranked highly in the WMT 18 Mandarin-English and CWMT 18 minority-language translation tasks. The system has been integrated into a micro-app in WeChat and can facilitate communication between speakers of Mandarin and minority languages.
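The idea of multi-task modeling with shared bottleneck layers followed by transfer learning can be illustrated with a minimal sketch. It is not the system's implementation: plain feed-forward layers stand in for the factorized TDNN, no training loop or LF-MMI objective is shown, and the class name `SharedBottleneckModel`, its methods, and all dimensions are hypothetical choices made for illustration. The sketch only shows the structure: every language passes through the same shared layers, each language has its own output head, and a new language starts from the shared layers with a fresh head.

```python
import math
import random


def linear(x, W, b):
    """Affine map; W is a list of rows, one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + bj for row, bj in zip(W, b)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialisation)."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


class SharedBottleneckModel:
    """Hypothetical stand-in: shared layers plus one output head per language."""

    def __init__(self, feat_dim, hidden_dim, bottleneck_dim, targets_per_language, seed=0):
        rng = random.Random(seed)
        # Layers shared by all languages; the last one is the bottleneck.
        self.shared = [
            init_layer(feat_dim, hidden_dim, rng),
            init_layer(hidden_dim, bottleneck_dim, rng),
        ]
        # One language-specific output layer per task.
        self.heads = {
            lang: init_layer(bottleneck_dim, n_targets, rng)
            for lang, n_targets in targets_per_language.items()
        }

    def bottleneck(self, frame):
        h = frame
        for W, b in self.shared:
            h = relu(linear(h, W, b))
        return h

    def posteriors(self, frame, lang):
        W, b = self.heads[lang]
        return softmax(linear(self.bottleneck(frame), W, b))

    def add_language(self, lang, n_targets, seed=1):
        """Transfer-learning starting point: reuse shared layers, add a fresh head."""
        bottleneck_dim = len(self.shared[-1][0])  # rows of the last shared weight matrix
        self.heads[lang] = init_layer(bottleneck_dim, n_targets, random.Random(seed))
```

In such a setup, the shared layers would be updated with data from all languages during multi-task training, while each head sees only its own language; fine-tuning for a specific language would then continue from the shared weights.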