“AI News Anchor” With Deep Learning-Based Speech Synthesis

Kiyoshi Kurihara, T. Fukaya, Satoshi Suzuki, N. Seiyama, Kazunari Saito, T. Kumano
DOI: https://doi.org/10.5594/JMI.2021.3057703
2021-04-01
SMPTE Motion Imaging Journal
Abstract: Deep learning-based text-to-speech (TTS) is used in various situations, and its sound quality is close to that of human speech. We previously developed a news-specific deep learning-based TTS (DL-TTS) system and implemented it in our AI news anchor for live broadcast programs and automatic news-speech distribution services. We also extended our DL-TTS system to control speaking style, speech rate, pitch, intonation, and volume to facilitate the creation of various programs. More specifically, this method enables switching between specific speaking styles, such as a news style, which mimics the style of news reporters, and a conversational style. The purpose of creating this system was to eliminate the discomfort caused by differences in speech and speaking styles. Controlling speaking style is important in news speech because a mismatched speaking style does not convey news articles appropriately. For this study, we conducted an evaluation experiment on speaking-style control when conveying simple news articles to language learners and identified appropriate speaking styles for automatically generated news speech. We conducted another evaluation experiment on whether synthetic speech generated by our system for “easy news” aimed at Japanese language learners can help people understand the news in Japanese. We also discuss practical applications of our system. Our news-specific deep neural network-based TTS system was found to be effective for providing news services at broadcast stations. In the future, we will consider various use cases of flexible production using a cloud system. The coronavirus pandemic has forced broadcasters to adopt new working styles. Thus, we will explore a new production system, such as a cloud-based system, for news-speech automation in this new normal.
Computer Science