Co-Whitening Of I-Vectors For Short And Long Duration Speaker Verification

Longting Xu,Kong-Aik Lee,Haizhou Li,Zhen Yang
DOI: https://doi.org/10.21437/Interspeech.2018-1246
2018-01-01
Abstract:An i-vector is a fixed-length and low-rank representation of a speech utterance. It has been used extensively in text independent speaker verification. Ideally, speech utterances from the same speaker would map to an unique i-vector. However, this is not the case due to some intrinsic and extrinsic factors like physical condition of the speaker, channel difference, noise and notably the duration of speech utterances. In particular, we found that i-vectors extracted from short utterances exhibit larger variance than that of long utterances. To address the problem, we propose a co-whitening approach, taking into account the duration, while maximizing the correlation between the i-vectors of short and long duration. The proposed co-whitening method was derived based on canonical correlation analysis (CCA). Experimental results on NIST SRE 2010 show that co-whitening method is effective in compensating the duration mismatch, leading to a reduction of up to 13.07% in equal error rate (EER).
What problem does this paper attempt to address?