Speech reconstruction using a generalized HSMM (GHSMM)

Michael D. Moore, Michael I. Savic
DOI: https://doi.org/10.1016/j.dsp.2003.07.003
IF: 2.92
2004-01-01
Digital Signal Processing
Abstract: Speech reconstruction is a relatively new application for stochastic processes such as the hidden Markov model (HMM) and hidden semi-Markov model (HSMM). While reconstruction has been attempted at the acoustic (actual speech) vector level, statistical reconstruction at the phoneme level has received less attention. Because the regeneration time (memory) of the HMM is on the order of a single acoustic vector, HMMs are relatively unsuited for reconstruction. HSMMs have a regeneration time (memory) that is on the order of a single phoneme, and thus are capable of reconstructing multiple damaged acoustic vectors within phonemes. We describe a dual-regeneration time generalized HSMM (GHSMM) that can reconstruct damaged acoustic vectors and multiple damaged phonemes in a longer utterance. This GHSMM uses a non-stationary transition matrix that is constructed to operate over two time scales: the regeneration time of a single phoneme and the regeneration time of an utterance.
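The contrast the abstract draws between HMM and HSMM memory can be illustrated with a small sketch. This is not the authors' GHSMM construction; it only shows the standard fact that an HMM state has a constant exit probability per frame (geometric duration), while an explicit duration distribution gives an exit probability that depends on how long the state has already lasted, which is one way a transition matrix becomes non-stationary. The function names and the example duration values are hypothetical, chosen purely for illustration.

```python
def exit_probabilities(duration_pmf):
    """Return P(leave state after frame d | state has lasted at least d frames).

    duration_pmf[i] is the (assumed) probability that the state lasts
    exactly i + 1 frames. The result is the discrete hazard function.
    """
    hazards = []
    remaining = 1.0  # probability mass of durations not yet ruled out
    for p in duration_pmf:
        hazards.append(p / remaining if remaining > 0.0 else 1.0)
        remaining -= p
    return hazards


def geometric_pmf(stay_prob, max_frames):
    """Duration distribution implied by an HMM self-transition probability."""
    return [stay_prob ** (d - 1) * (1.0 - stay_prob)
            for d in range(1, max_frames + 1)]


# HMM-like state: self-transition 0.8, so the exit probability never changes.
hmm_exit = exit_probabilities(geometric_pmf(0.8, 5))

# HSMM-like state: hypothetical phoneme durations peaked at 3 frames.
hsmm_exit = exit_probabilities([0.05, 0.15, 0.40, 0.30, 0.10])
```

Here `hmm_exit` stays flat across frames, whereas `hsmm_exit` rises with elapsed duration, so the model "remembers" how long it has been in the phoneme.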