Precise Phone Boundary Detection using Selective Context-dependent Acoustic Refinement

Sirinoot Boonsuk, P. Punyabukkana, A. Suchato
Abstract: This paper proposes an automatic method for locating phone boundaries in speech utterances based on HMM-based forced alignment together with context-dependent refinements. HMM-based forced alignment has been a preferred method for speech segmentation in many applications. However, the resulting boundaries are usually not consistent with real boundaries, which are defined by abrupt changes in acoustic properties and are required in a number of applications such as landmark- and segment-based speech recognizers. In this work, the boundary refinement process fine-tunes boundary locations using acoustic features selected according to boundary type, where boundary types are categorized by manner of articulation. Several acoustic measurements are combined with spectral features to form the set of acoustic features to select from. The performance of the refinements on different boundary types is investigated on a development set. Refinements that perform well are identified and then applied to an unseen test set. A maximum accuracy of 87.13% is achieved in speaker-independent speech segmentation experiments on 490 continuous speech utterances. This is a 14.7% error reduction compared to the baseline HMM-based forced alignment.
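The selective refinement idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the boundary-type table, the feature names, the search window, and the "largest frame-to-frame change" criterion are all assumptions made for illustration. The sketch takes a boundary frame from forced alignment, looks up a feature for that boundary type, and moves the boundary to the frame with the most abrupt change nearby; boundary types with no listed refinement are left at the forced-alignment position.

```python
from typing import Dict, List, Tuple

# Hypothetical table: boundary type (manner of articulation of the left and
# right phone) -> name of the acoustic feature used for refinement.
# Types missing from the table are not refined (the "selective" part).
FEATURE_FOR_BOUNDARY: Dict[Tuple[str, str], str] = {
    ("stop", "vowel"): "energy",
    ("fricative", "vowel"): "zero_crossing_rate",
}


def refine_boundary(
    features: Dict[str, List[float]],
    boundary_type: Tuple[str, str],
    initial_frame: int,
    window: int = 3,
) -> int:
    """Return a refined boundary frame near the forced-alignment boundary.

    Assumed criterion: choose the frame t within +/- `window` frames of
    `initial_frame` where |x[t] - x[t-1]| is largest for the selected feature.
    """
    name = FEATURE_FOR_BOUNDARY.get(boundary_type)
    if name is None:
        return initial_frame  # no refinement selected for this boundary type

    track = features[name]
    lo = max(1, initial_frame - window)            # t - 1 must be a valid index
    hi = min(len(track) - 1, initial_frame + window)
    if lo > hi:
        return initial_frame  # track too short to search

    # Frame with the most abrupt change relative to its predecessor.
    return max(range(lo, hi + 1), key=lambda t: abs(track[t] - track[t - 1]))


# Toy per-frame features with an abrupt energy rise between frames 4 and 5.
toy_features = {
    "energy": [0.1, 0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9],
    "zero_crossing_rate": [0.5] * 8,
}
refined = refine_boundary(toy_features, ("stop", "vowel"), initial_frame=3)
unrefined = refine_boundary(toy_features, ("vowel", "vowel"), initial_frame=3)
```

In this toy example the forced-alignment boundary at frame 3 is moved to the frame where the energy jumps, while the vowel-to-vowel boundary, which has no entry in the table, is returned unchanged.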