Entropy of Energy Operator As Feature for Large Vocabulary Mandarin Speaker Independent Speech Recognition

Fadhil H. T. Al-dulaimy
DOI: https://doi.org/10.21437/icslp.2002-576
2002-01-01
Abstract: This work demonstrates the use of the nonlinear time-frequency distribution (NLTFD) of a discrete time energy operator (DTEO), based on amplitude modulation-frequency modulation (AM-FM) demodulation techniques, as a feature in speech recognition. The duration-distribution-based hidden Markov model in a speaker-independent, large-vocabulary Mandarin speech recognition system was reconstructed from the feature vectors in the front-end detection stage. The goal was to improve the performance of the existing system by combining new features with the baseline feature vector. This paper also deals with errors associated with using a pre-emphasis filter in the front-end processing of the present scheme, which increases the noise energy at high frequencies above 4 kHz and in some cases degrades the recognition accuracy. The experimental results show that eliminating the pre-emphasis filter from the pre-processing stage and using NLTFD with compensated DTEO, combined with Mel frequency cepstrum components, yields a 21.95% relative reduction in error rate compared with the conventional technique, with 25 candidates used in the test.
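For readers unfamiliar with the two front-end pieces the abstract mentions, the following is a minimal sketch, not the paper's implementation. It assumes the standard Teager-Kaiser form of the discrete time energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and a first-order pre-emphasis filter y[n] = x[n] - a*x[n-1]. The coefficient a = 0.97 and the function names are illustrative assumptions; the abstract does not specify them, and the NLTFD, entropy, and compensation steps are not reproduced here.

```python
import numpy as np


def teager_energy(x):
    """Standard discrete-time (Teager-Kaiser) energy operator.

    psi[n] = x[n]^2 - x[n-1] * x[n+1], evaluated on interior samples only,
    so the output is two samples shorter than the input.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def pre_emphasis_gain(omega, a=0.97):
    """Magnitude response of y[n] = x[n] - a * x[n-1].

    omega is the digital frequency in radians per sample.
    a = 0.97 is an assumed, commonly used value, not one taken from the paper.
    """
    return np.abs(1.0 - a * np.exp(-1j * omega))


# For a pure tone A*cos(omega*n + phi), the operator output is the constant
# A^2 * sin(omega)^2, i.e. it tracks both amplitude and frequency.
A, omega, phi = 2.0, 0.3, 0.5
n = np.arange(200)
tone = A * np.cos(omega * n + phi)
psi = teager_energy(tone)

# Gain of the assumed pre-emphasis filter at DC and at the Nyquist frequency.
gain_dc = pre_emphasis_gain(0.0)
gain_nyquist = pre_emphasis_gain(np.pi)
```

Under these assumptions the filter gain rises from 1 - a at DC to 1 + a at the Nyquist frequency, which illustrates why a pre-emphasis stage can amplify high-frequency noise, the effect the abstract attributes to it.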