An HMM-based Segment Quantizer and Its Application to Low Bit Rate Speech Coding
Motoyuki Suzuki, Masashi Adachi, Minoru Kohata, Akinori Ito, Shozo Makino, Fuji Ren
2010-01-01
Abstract: Several speech coding systems employ a segment quantizer instead of a vector quantizer, and one of the most important problems in such systems is how to construct the segment codebook. In this paper, a new speech coder based on ML-BEATS, an HMM-based segment quantizer, is proposed. ML-BEATS first splits a vector sequence into several sub-sequences and then clusters these sub-sequences to construct a codebook. Each cluster center is represented by a left-to-right HMM. In the encoding process, input speech is matched against the HMMs in the codebook, and the HMM indices and duration information are sent to the decoder. In the decoding process, a decoded sequence is generated from the HMM parameters by applying the HMM-based speech synthesis method. In the experiments, the HMM-based speech coder gave a spectral distortion of 1.13 dB at 5.83 bit/frame. This distortion is 0.11 dB higher than that of the G.729 coder, but the bit rate was only 32% of that of G.729. To address the problem of shifted LSP dimensions, we also propose a new codebook construction method. Many training vectors are extracted from the training samples by shifting dimensions, and all of these vectors are used to construct a universal codebook. The universal codebook can deal with any shifted vector because all possible shifts are included in the training data. In the experiments, the shifted vector method encoded input speech at a very low bit rate, but it gave higher spectral distortion.
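The encode/decode flow described in the abstract (match each segment against a codebook, transmit an index and a duration, regenerate frames at the decoder) can be illustrated with a heavily simplified sketch. It is not the ML-BEATS coder itself. It assumes the segment boundaries are already given, uses a single mean vector per codeword as a stand-in for a left-to-right HMM, and decodes by repeating that mean rather than by HMM-based synthesis. All function names, the toy data, and the fixed number of duration bits are hypothetical.

```python
import math

def encode(frames, boundaries, codebook):
    """Return one (codeword index, duration) pair per segment.

    Assumption: each codeword is a plain mean vector, not an HMM, and
    the match score is the summed squared error over the segment.
    """
    codes = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        segment = frames[start:end]

        def cost(mean):
            return sum((x - m) ** 2
                       for frame in segment
                       for x, m in zip(frame, mean))

        index = min(range(len(codebook)), key=lambda i: cost(codebook[i]))
        codes.append((index, end - start))
    return codes

def decode(codes, codebook):
    """Rebuild a frame sequence by repeating each codeword for its duration."""
    return [list(codebook[i]) for i, duration in codes for _ in range(duration)]

def bits_per_frame(codes, codebook_size, duration_bits):
    """Average bits per frame when each segment costs index bits plus duration bits."""
    total_bits = len(codes) * (math.log2(codebook_size) + duration_bits)
    total_frames = sum(duration for _, duration in codes)
    return total_bits / total_frames

# Toy data (hypothetical): five 2-dimensional frames split into two segments.
codebook = [[0.0, 0.0], [1.0, 1.0]]
frames = [[0.1, -0.1], [0.0, 0.2], [0.9, 1.1], [1.0, 1.0], [1.2, 0.8]]
boundaries = [0, 2, 5]

codes = encode(frames, boundaries, codebook)
decoded = decode(codes, codebook)
rate = bits_per_frame(codes, codebook_size=len(codebook), duration_bits=2)
```

The sketch shows why a segment quantizer can reach a low bit rate: the cost of one index and one duration is shared by every frame in the segment, so longer segments lower the average bits per frame.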