Exploring Sequential Characteristics in Speaker Bottleneck Feature for Text-Dependent Speaker Verification.

Liping Chen, Yong Zhao, Shi-Xiong Zhang, Jie Li, Guoli Ye, Frank Soong
DOI: https://doi.org/10.1109/icassp.2018.8462467
2018-01-01
Abstract: This paper exploits the sequential speaker characteristics in speaker bottleneck feature vectors, extracted with speaker-discriminant neural networks, for text-dependent speaker verification. In each evaluation trial, speaker supervectors represent the sequential speaker characteristics of the compared speech utterances. To this end, dynamic time warping is used to warp the variable-length speaker feature vector sequences of the utterances to the same length. Each utterance's speaker supervector is then obtained by concatenating its speaker feature vectors. Decision scores on the speaker supervectors are computed with either Euclidean distance or a support vector machine (SVM). Experiments on a Microsoft internal keyword-spotting database showed the effectiveness of the proposed speaker supervector for text-dependent speaker verification. Moreover, with the SVM backend for scoring, the speaker supervector achieved the best equal error rate (EER) of 1.627%, better than the combination of i-vector and probabilistic linear discriminant analysis.
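The supervector pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the warped sequences are brought to the same length, so the sketch assumes both utterances are expanded along a standard DTW alignment path. The function names (`frame_dist`, `dtw_path`, `supervectors`, `euclidean_score`) and the toy feature values are hypothetical, and the SVM backend is not shown.

```python
import math

def frame_dist(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))

def dtw_path(a, b):
    """Return a minimum-cost DTW alignment path as (index_in_a, index_in_b) pairs."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: best accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
            cost[i][j] = frame_dist(a[i - 1], b[j - 1]) + best_prev
    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        moves = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal step
            (cost[i - 1][j], i - 1, j),          # advance in a only
            (cost[i][j - 1], i, j - 1),          # advance in b only
        ]
        _, i, j = min(moves, key=lambda t: t[0])
    path.reverse()
    return path

def supervectors(a, b):
    """Warp both sequences along the DTW path (assumed scheme) and concatenate frames."""
    path = dtw_path(a, b)
    sv_a = [v for i, _ in path for v in a[i]]
    sv_b = [v for _, j in path for v in b[j]]
    return sv_a, sv_b

def euclidean_score(a, b):
    """Negated Euclidean distance between supervectors; higher means more similar."""
    sv_a, sv_b = supervectors(a, b)
    return -frame_dist(sv_a, sv_b)

# Toy 2-dimensional "speaker feature" sequences (made-up values).
enroll = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
stretched = [[0.0, 1.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0], [2.0, 0.0]]
other = [[5.0, 5.0], [6.0, 5.0], [7.0, 4.0]]

same_score = euclidean_score(enroll, stretched)
diff_score = euclidean_score(enroll, other)
```

In this sketch a time-stretched copy of the enrollment sequence aligns at zero distance, while a sequence with different frame values scores lower. A real system would apply the same steps to bottleneck features from the speaker-discriminant network and could train an SVM on the resulting supervectors instead of thresholding the distance.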