Abstract: This paper presents a new inductive learning algorithm based on extension matrix theory and applies it to the prosodic phrasing problem in Chinese text-to-speech (TTS) systems. The authors propose a novel definition of consistency for a rule and for a set of positive examples, and relate the two through a theorem. By dividing the positive examples of a given class into consistent groups and adopting a simple strategy to find, for each group, a conjunctive rule that covers all of the group's positive examples and none of the negative examples, the algorithm finds a set of consistent rules expressed in variable-valued logic. The authors collected 937 sentences of different genres (about 78 minutes of speech) from CCTV news programs and built a large speech corpus. A group of features for modeling prosody is also proposed, and their effectiveness is assessed through the interpretation of the resulting rules. Finally, a series of experiments is conducted. The data are divided into a training set and a test set, and the experimental results show that the authors' method achieves higher accuracy, better interpretability, and fewer rules than other algorithms. The generated rules are also quite similar to hand-crafted ones, which may help clarify the relationship between Chinese syntax and prosody.
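
The grouping idea described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the abstract does not give their definition of consistency, so the code assumes that a group of positive examples is consistent when the conjunctive rule built from the union of the group's attribute values covers no negative example. The greedy first-fit grouping, the function names, and the toy features (tag of the previous word, tag of the next word, a length bucket) are all hypothetical.

```python
from typing import List, Sequence, Set, Tuple

Example = Tuple[str, ...]   # one nominal value per attribute
Rule = List[Set[str]]       # allowed values per attribute; the rule is their conjunction


def group_rule(group: Sequence[Example]) -> Rule:
    """Build a conjunctive rule: attribute j must take a value seen in the group."""
    n_attrs = len(group[0])
    return [{example[j] for example in group} for j in range(n_attrs)]


def covers(rule: Rule, example: Example) -> bool:
    """A rule covers an example if every attribute value is allowed."""
    return all(value in allowed for value, allowed in zip(example, rule))


def is_consistent(group: Sequence[Example], negatives: Sequence[Example]) -> bool:
    """Assumed definition: the group's rule excludes every negative example."""
    rule = group_rule(group)
    return not any(covers(rule, neg) for neg in negatives)


def learn_rules(positives: Sequence[Example],
                negatives: Sequence[Example]) -> List[Rule]:
    """Greedy first-fit grouping (an assumption), then one rule per group.

    Assumes no positive example is identical to a negative example;
    otherwise a singleton group would already be inconsistent.
    """
    groups: List[List[Example]] = []
    for pos in positives:
        for group in groups:
            if is_consistent(group + [pos], negatives):
                group.append(pos)
                break
        else:
            groups.append([pos])
    return [group_rule(group) for group in groups]


# Hypothetical toy data: (previous tag, next tag, length bucket).
positives = [("n", "v", "long"), ("n", "p", "long"), ("v", "n", "short")]
negatives = [("n", "v", "short"), ("v", "n", "long")]

rules = learn_rules(positives, negatives)
for rule in rules:
    print(" AND ".join(f"[x{j} in {sorted(vals)}]" for j, vals in enumerate(rule)))
```

Under these assumptions, each rule is a conjunction of selectors of the form "attribute in value set", which is the general shape of a variable-valued logic rule. By construction, every positive example is covered by the rule of its own group, and no rule covers a negative example.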

Chinese prosodic phrasing with the source-channel model

Chinese Prosodic Phrasing with a Constraint-Based Approach.

Chinese Prosodic Phrasing with Extended Features.

A Tree-Based Model of Prosodic Phrasing for Chinese Text-to-Speech Systems

Training Prosodic Phrasing Rules for Chinese TTS Systems

Prosodic phrasing with in †

Modeling Prosody Patterns for Chinese Expressive Text-to-speech Synthesis

Prosodic boundary prediction based on maximum entropy model with error-driven modification

Prosodic Phrasing with Inductive Learning.

Chinese Prosodic Phrasing Based on Extension Matrix Theory

Prosodic Modeling with Rich Syntactic Context in HMM-based Mandarin Speech Synthesis

Statistical Model Based on Probability Frequency for Mandarin Prosodic Structure Prediction

Modeling prosody pattern of Chinese expressive speech and its application in personalized speech conversion

A Maximum Entropy Based Hierarchical Model for Automatic Prosodic Boundary Labeling in Mandarin

Maximum Entropy Based Tone Modeling for Mandarin Speech Recognition

Prosody Model for Mandarin Text-to-Speech System

Predicting Chinese Prosodic Phrase with Height of Syntax Tree

Prosodic Phrase Analysis based on Probability and Statistics

A Superposed Prosodic Model for Chinese Text-To-Speech Synthesis

HIERARCHICAL PROSODY MODELING FOR NON-AUTOREGRESSIVE SPEECH SYNTHESIS

Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS