Multiscale Fractal Dimension Based I/F Segmentation for Mandarin Speech

王帆,郑方,吴文虎
DOI: https://doi.org/10.3321/j.issn:1000-0054.2002.01.021
2002-01-01
Abstract: This paper presents a new algorithm for Initial/Final (I/F) segmentation of Mandarin speech in adverse environments, based on the multiscale fractal dimension (MFD). Motivated by the chaotic characteristics of speech production, the concept and computational method of the MFD are extended from the traditional fractal dimension to show local self-similar behavior at multiple maximum resolutions of computation. Analysis of the disparate characteristics of the MFD clearly distinguishes the stable phonemes (the Initial and Final parts) from the transient region between them. Because a Mandarin syllable has the special Initial + Final structure, the new segmentation algorithm can directly search for the speech frame with the minimum MFD variance (the degree of difference among all elements of an MFD) and take it as the I/F segmentation boundary. A segmentation accuracy of 95.2% is obtained for clean speech and 82.3% for noisy speech at an SNR of 10 dB.
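The boundary search described in the abstract can be illustrated with a minimal sketch. It assumes each speech frame already has an MFD vector (one fractal-dimension value per scale) and covers only the final step: computing the variance of each vector and picking the frame where it is smallest. The function names, the use of population variance, and the toy numbers are assumptions made for illustration; the abstract does not specify how the MFD itself or the variance is computed.

```python
from statistics import pvariance


def mfd_variance(mfd):
    """Population variance of one frame's MFD vector (assumed variance definition)."""
    return pvariance(mfd)


def find_if_boundary(mfd_per_frame):
    """Return the index of the frame whose MFD vector has the smallest variance."""
    variances = [mfd_variance(v) for v in mfd_per_frame]
    return min(range(len(variances)), key=variances.__getitem__)


# Toy MFD vectors (made-up numbers): one list per frame, three scales each.
toy_mfd = [
    [1.2, 1.5, 1.8],
    [1.3, 1.5, 1.7],
    [1.5, 1.5, 1.6],
    [1.4, 1.6, 1.9],
]

# Index of the candidate I/F boundary frame within this toy syllable.
boundary = find_if_boundary(toy_mfd)
```

In this sketch the search runs over every frame of a single syllable; restricting it to a window, or smoothing the variance track first, would be further design choices that the abstract does not address.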