Switching Auxiliary Chains for Speech Recognition

Hui Lin, Zhijian Ou
DOI: https://doi.org/10.1109/LSP.2006.891314
2007-01-01
Abstract: This letter investigates how to incorporate auxiliary information, e.g., pitch, zero crossing rate (ZCR), and rate-of-speech (ROS), into speech recognition using dynamic Bayesian networks. We propose switching auxiliary chains, which exploit different auxiliary information tailored to different phonetic states. The switching function can be specified by a priori knowledge or, more flexibly, learned from data with information-theoretic dependency selection. Experiments on the OGI Numbers database show that the new model achieves a 7% relative reduction in word error rate by jointly exploiting pitch, ZCR, and ROS, while keeping almost the same parameter size as the standard HMM.