Though this be hesitant, yet there is method in ’t: Effects of disfluency patterns in neural speech synthesis for cultural heritage presentations

Loredana Schettino, Antonio Origlia, Francesco Cutugno
DOI: https://doi.org/10.1016/j.csl.2023.101585
IF: 3.252
2024-04-01
Computer Speech & Language
Abstract: This study presents the results of two perception experiments aimed at evaluating the effect that specific patterns of disfluencies have on people listening to synthetic speech. We consider the particular case of Cultural Heritage presentations and propose a linguistic model to support the positioning of disfluencies throughout the utterances in the Italian language. A state-of-the-art speech synthesizer, based on Deep Neural Networks, is used to prepare a set of experimental stimuli, and two different experiments are presented to provide both subjective evaluations and behavioural assessments from human subjects. Results show that synthetic utterances including disfluencies, predicted by a linguistic model, are identified as more natural and that the presence of disfluencies benefits the listeners’ recall of the provided information.
computer science, artificial intelligence