Neural Network Ensemble Based on Vowel Classification for Chinese Speaker Recognition

Bo Qian, Zhen-min Tang, Yan-ping Li, Li-min Xu, Yan Zhang
DOI: https://doi.org/10.1109/ICNC.2007.495
2007-01-01
Abstract: It is well known that the features of a speech signal carry not only speaker identity information but also semantic information. In this paper, we describe a novel neural network ensemble architecture based on the finding that, from the standpoint of short-term analysis, diphthongs and multi-vowels in Chinese can be approximately regarded as combinations of mono-vowels and transitional parts. Several neural networks are trained, each on the eigenspace of one mono-vowel, and their outputs are combined by a further combining neural network. The architecture improves recognition accuracy by eliminating the interference of semantic information. Experimental results show that the recognition accuracy of the proposed approach is higher than that of conventional methods such as a single neural network and other previously proposed ensemble structures.
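The data flow described in the abstract (one network per mono-vowel, with outputs merged by a combining network) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the vowel list `VOWELS`, the class names `TinyMLP` and `VowelEnsemble`, the layer sizes, the per-vowel averaging of frame outputs, and the zero vector used for vowels with no frames. The weights are random and untrained, and the input frames are assumed to be already projected into each mono-vowel's eigenspace, so the code shows only how information moves through such an ensemble, not the authors' implementation.

```python
import math
import random

# Assumed mono-vowel inventory; the abstract does not list the vowels used.
VOWELS = ["a", "o", "e", "i", "u", "yu"]


def make_layer(n_in, n_out, rng):
    # One weight row per output unit; the last entry of each row is the bias.
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward_layer(layer, x, activation):
    return [activation(sum(w * v for w, v in zip(row[:-1], x)) + row[-1]) for row in layer]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


class TinyMLP:
    """Untrained one-hidden-layer network (hypothetical stand-in)."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        self.hidden = make_layer(n_in, n_hidden, rng)
        self.out = make_layer(n_hidden, n_out, rng)

    def predict(self, x):
        h = forward_layer(self.hidden, x, math.tanh)
        return softmax(forward_layer(self.out, h, lambda v: v))


class VowelEnsemble:
    """One expert network per mono-vowel plus a combining network."""

    def __init__(self, n_features, n_speakers, seed=0):
        rng = random.Random(seed)
        self.n_speakers = n_speakers
        self.experts = {v: TinyMLP(n_features, 8, n_speakers, rng) for v in VOWELS}
        self.combiner = TinyMLP(len(VOWELS) * n_speakers, 8, n_speakers, rng)

    def predict(self, frames_by_vowel):
        combined = []
        for v in VOWELS:
            frames = frames_by_vowel.get(v, [])
            if frames:
                # Average the expert's speaker scores over this vowel's frames.
                outputs = [self.experts[v].predict(f) for f in frames]
                combined.extend(sum(col) / len(outputs) for col in zip(*outputs))
            else:
                # Assumption: a vowel absent from the utterance contributes zeros.
                combined.extend([0.0] * self.n_speakers)
        return self.combiner.predict(combined)
```

A call such as `VowelEnsemble(n_features=4, n_speakers=3).predict({"a": [[0.1, 0.2, 0.3, 0.4]]})` returns one score per candidate speaker. Routing each frame only to the expert for its vowel class is what keeps vowel-specific (semantic) variation out of each expert's input in this sketch.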