Batch Normalization based Unsupervised Speaker Adaptation for Acoustic Models

Jiangyan Yi, Jianhua Tao
DOI: https://doi.org/10.1109/APSIPAASC47483.2019.9023185
2019-01-01
Abstract: This paper proposes a simple yet effective unsupervised speaker adaptation approach for batch normalization based deep neural network acoustic models. The basic idea is to recompute the means and variances in all batch normalization layers over the test data of each speaker, which brings the distribution of the test data closer to that of the training data. The approach does not adjust any trainable parameters of the acoustic model. Experiments are conducted on the CHiME-3 dataset. The results show that the proposed adaptation achieves a 2.17% relative reduction in average word error rate (WER) on the real test set compared with scaling and shifting factors (SSF) adaptation. Combining the proposed mean and variance (MV) adaptation with SSF adaptation yields further improvement.
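The recomputation step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the feed-forward layout, the ReLU activation, the layer-by-layer order of recomputation, and the names `batchnorm` and `adapt_bn_statistics` are all assumptions made for illustration. The only idea taken from the abstract is that the per-layer means and variances are re-estimated from one speaker's test frames while the trained parameters stay fixed.

```python
import numpy as np

def batchnorm(x, mean, var, gamma, beta, eps=1e-5):
    """Standard batch normalization with explicit statistics."""
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def adapt_bn_statistics(layers, frames):
    """Re-estimate batch-norm means and variances on one speaker's frames.

    layers: list of dicts with trained parameters "W", "b", "gamma", "beta"
            (hypothetical layout; none of these arrays is modified).
    frames: array of shape (num_frames, feat_dim) holding the speaker's
            unlabeled test features.
    Returns a list of (mean, var) pairs, one per batch-norm layer.
    """
    stats = []
    h = frames
    for layer in layers:
        z = h @ layer["W"] + layer["b"]           # affine transform
        mean, var = z.mean(axis=0), z.var(axis=0)  # speaker-specific statistics
        stats.append((mean, var))
        # Propagate using the new statistics so that deeper layers are
        # re-estimated on already-adapted activations (an assumption).
        h = np.maximum(batchnorm(z, mean, var, layer["gamma"], layer["beta"]), 0.0)
    return stats
```

At decoding time, the returned statistics would replace the training-set statistics stored in each batch normalization layer for that speaker. No labels or gradient updates are involved, which is what makes the scheme unsupervised.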