Combining ANNs to improve phone recognition

B. Mak
DOI: https://doi.org/10.1109/ICASSP.1997.595487
1997-04-21
Abstract: In applying neural networks to speech recognition, one often finds that slightly different training configurations lead to significantly different networks. Thus, different training sessions using different setups will likely end up in "mixed" network configurations representing different solutions in different regions of the data space. This sensitivity to the initial weights, the training parameters, and the training data can be exploited to enhance performance by using a committee of neural networks. We study various ways to combine context-dependent (CD) and context-independent (CI) neural network phone estimators to improve phone recognition. As a result, we obtain increases of 6.3% and 2.2% in phone recognition accuracy using monophones and biphones, respectively.
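
The abstract does not say which combination rules were studied. As a rough illustration of the committee idea only, the sketch below combines per-frame phone posteriors from two estimators in two common ways: a weighted arithmetic mean and a weighted geometric (log-linear) mean. The function names, the equal default weights, and the posterior values are all hypothetical and are not taken from the paper.

```python
import math

def average_posteriors(estimates, weights=None):
    """Weighted arithmetic mean of phone posteriors from several networks."""
    n = len(estimates)
    if weights is None:
        weights = [1.0 / n] * n  # assumption: equal trust in every network
    num_phones = len(estimates[0])
    return [sum(w * p[k] for w, p in zip(weights, estimates))
            for k in range(num_phones)]

def log_linear_posteriors(estimates, weights=None, floor=1e-10):
    """Weighted geometric mean of phone posteriors, renormalized to sum to 1."""
    n = len(estimates)
    if weights is None:
        weights = [1.0 / n] * n
    num_phones = len(estimates[0])
    # Floor each probability so that log(0) cannot occur.
    log_scores = [sum(w * math.log(max(p[k], floor))
                      for w, p in zip(weights, estimates))
                  for k in range(num_phones)]
    peak = max(log_scores)  # subtract the peak for numerical stability
    unnorm = [math.exp(s - peak) for s in log_scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical posteriors over three phones for a single frame.
ci_net = [0.6, 0.3, 0.1]  # stand-in for a context-independent estimator
cd_net = [0.2, 0.7, 0.1]  # stand-in for a context-dependent estimator

combined = average_posteriors([ci_net, cd_net])
geometric = log_linear_posteriors([ci_net, cd_net])
best_phone = max(range(len(combined)), key=combined.__getitem__)
```

In this toy frame the two networks disagree on the most likely phone, and the committee output settles on the phone that has the higher combined support. In practice the weights would be tuned on held-out data rather than fixed at equal values.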