A Preliminary Study on GMM Weight Transformation for Emotional Speaker Recognition

Li Chen, Yingchun Yang
DOI: https://doi.org/10.1109/acii.2013.12
2013-01-01
Abstract: The performance of a speaker recognition system degrades when the speaker's emotional state differs between the enrollment and evaluation stages. Emotional GMM model synthesis, such as NEGT (Neutral-Emotional GMM mean Transformation), is one way to reduce this degradation. This paper finds that GMM weight transformation is also feasible, and that it requires modifying far fewer parameters than GMM mean transformation. We therefore propose two algorithms, RBFNN (Radial Basis Function Neural Network) based and EBSR (Exemplar Based Sparse Representation) based GMM weight transformation, to model the neutral-to-emotion weight transformation law for emotional GMM model synthesis. Experiments carried out on MASC show that these two algorithms increase the identification rate (IR) by 6.91% and 5.74% respectively, compared with the GMM-UBM system. Both algorithms also require less development data and time than NEGT.
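To make the idea of weight-only transformation concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a diagonal-covariance GMM and a caller-supplied `mapping` function that stands in for a learned neutral-to-emotion weight transformation (the RBFNN and EBSR models from the abstract are not reproduced here). The function names, the `floor` parameter, and the toy numbers are all hypothetical.

```python
import math

def count_modified_params(n_components, n_dims):
    """Parameters touched by weight-only vs. mean-only transformation."""
    return {"weights": n_components, "means": n_components * n_dims}

def transform_weights(neutral_weights, mapping, floor=1e-12):
    """Apply a placeholder neutral-to-emotion mapping, then renormalize.

    `mapping` is a hypothetical stand-in for a learned transformation;
    the result is floored and rescaled so it is a valid weight vector.
    """
    raw = [max(floor, float(v)) for v in mapping(list(neutral_weights))]
    total = sum(raw)
    return [v / total for v in raw]

def avg_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        logs = []
        for w, mu, var in zip(weights, means, variances):
            quad = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mu, var))
            norm = sum(math.log(2 * math.pi * vi) for vi in var)
            logs.append(math.log(w) - 0.5 * (quad + norm))
        peak = max(logs)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(l - peak) for l in logs))
    return total / len(frames)

# Toy example: means and variances stay fixed, only the weights change.
means = [[0.0, 0.0], [5.0, 5.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
neutral_w = [0.9, 0.1]
emotional_w = transform_weights(neutral_w, lambda w: w[::-1])  # placeholder mapping
frames = [[5.0, 5.0], [4.8, 5.1]]  # made-up "emotional" frames

ll_neutral = avg_log_likelihood(frames, neutral_w, means, variances)
ll_emotional = avg_log_likelihood(frames, emotional_w, means, variances)
```

In this toy setting the synthesized model reuses the neutral means and variances unchanged, so only one value per mixture component is modified, whereas a mean transformation would modify one value per component per feature dimension.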