Detection of Dynamic Structures of Speech Fundamental Frequency in Tonal Languages

Hong Hong, Zhengmin Zhao, Xinlong Wang, Zhiyong Tao
DOI: https://doi.org/10.1109/LSP.2010.2058799
2010-01-01
Abstract: An approach is proposed specifically for capturing the fine dynamic structure of the speech fundamental frequency (F0), which may vary nonmonotonically, as it does in the third tone of Chinese speech. The method first estimates the rough trend of an F0 contour using the cepstrum technique. It then uses that trend as a reference to track the variation and computes the detailed contour from a few intrinsic mode functions obtained by ensemble empirical mode decomposition. Extensive evaluation and direct comparisons with existing methods are conducted on the standard Chinese Mandarin database, showing that the proposed method acquires accurate and reliable F0 contours from speech signals, even those heavily contaminated with noise.
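As an illustration of the first stage only, the rough cepstrum-based F0 estimate for a single frame might be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `cepstral_f0`, the search range `f0_min`/`f0_max`, the window choice, and the log floor are all hypothetical, and the refinement with ensemble empirical mode decomposition is not shown.

```python
import numpy as np


def cepstral_f0(frame, fs, f0_min=60.0, f0_max=500.0):
    """Rough F0 (Hz) of one voiced frame from the peak of its real cepstrum.

    Hypothetical helper: the name, the default search range, and the
    window are illustrative assumptions, not taken from the paper.
    """
    n = len(frame)
    # Periodic Hamming window: drop the last sample of an (n + 1)-point window.
    window = np.hamming(n + 1)[:-1]
    spectrum = np.fft.rfft(frame * window)
    # A small floor (an assumption) keeps the logarithm finite at empty bins.
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    # Real cepstrum: inverse transform of the log-magnitude spectrum.
    cepstrum = np.fft.irfft(log_mag, n)
    # Search only quefrencies (in samples) that map to plausible pitch values.
    q_lo = int(fs / f0_max)
    q_hi = min(int(fs / f0_min), n // 2)
    peak = q_lo + int(np.argmax(cepstrum[q_lo:q_hi + 1]))
    return fs / peak


# Synthetic check: an impulse train with a period of 80 samples at 16 kHz.
fs = 16000
frame = np.zeros(640)
frame[::80] = 1.0
f0_hz = cepstral_f0(frame, fs)
print(f"estimated F0: {f0_hz:.1f} Hz")
```

A period of 80 samples at 16 kHz corresponds to 200 Hz, so the synthetic frame gives a quick sanity check. Applying the function frame by frame to voiced speech would yield the kind of coarse trend that the abstract describes as the reference for the finer tracking step.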