On the Prosody Control Characteristics of Nonverbal Utterances and Its Application to Communicative Prosody Generation
Ke Li, Yoko Greenberg, Nagisa Shibuya, Yoshinori Sagisaka, Nick Campbell
DOI: https://doi.org/10.1121/1.4787475
2006-01-01
Abstract: In this paper, the prosodic characteristics of nonverbal utterances were analyzed using the F0 generation model proposed by Fujisaki, with the aim of generating communicative speech. The analysis showed that the F0 generation parameters are distributed differently across four prototypical dynamic patterns (rise, gradual fall, fall, and rise&down). Since earlier work has shown that these differences can correspond to the impressions the utterances convey (such as confident-doubtful, allowable-unacceptable, and positive-negative), expressed as multi-dimensional vectors, we built a computational model that maps an impression vector to F0 generation parameters. The mapping from impression vectors to prosody generation parameters was obtained with a statistical optimization technique. Perceptual evaluation tests using neutral words confirmed that the mapping is effective for producing communicative speech prosody. [Work supported in part by the Waseda Univ. RISE research project “Analysis and modeling of human mechanism in speech and language processing” and by Grant-in-Aid for Scientific Research B-2, No. 18300063, of JSPS.]
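The abstract does not spell out the F0 generation model, so the following is only a minimal sketch of the commonly published form of Fujisaki's command-response model: log F0 is the sum of a baseline, phrase components, and accent components. The constants `ALPHA`, `BETA`, `GAMMA`, the function names, and the example command values are illustrative assumptions, not values or code from this paper.

```python
import math

# Commonly quoted default constants (assumptions, not taken from this paper).
ALPHA = 3.0   # natural angular frequency of the phrase control mechanism (1/s)
BETA = 20.0   # natural angular frequency of the accent control mechanism (1/s)
GAMMA = 0.9   # ceiling of the accent component


def phrase_response(t, alpha=ALPHA):
    """Impulse response of the phrase control mechanism, Gp(t)."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=BETA, gamma=GAMMA):
    """Step response of the accent control mechanism, Ga(t), clipped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, fb, phrases, accents):
    """F0 in Hz at time t (seconds).

    fb      : baseline frequency Fb in Hz
    phrases : list of (Ap, T0) = (magnitude, onset time) phrase commands
    accents : list of (Aa, T1, T2) = (amplitude, onset, offset) accent commands
    """
    log_f0 = math.log(fb)
    for ap, t0 in phrases:
        log_f0 += ap * phrase_response(t - t0)
    for aa, t1, t2 in accents:
        log_f0 += aa * (accent_response(t - t1) - accent_response(t - t2))
    return math.exp(log_f0)


# Hypothetical example: one phrase command and one accent command,
# sampled every 10 ms over half a second.
contour = [
    fujisaki_f0(i * 0.01, 120.0, [(0.3, 0.0)], [(0.4, 0.1, 0.3)])
    for i in range(50)
]
```

In these terms, a mapping of the kind the abstract describes would take an impression vector as input and output the command parameters (Fb, Ap, T0, Aa, T1, T2) that drive such a function; the specific optimization technique is not stated here, so it is left out of the sketch.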