Domain adaptation based Speaker Recognition on Short Utterances

Ahilan Kanagasundaram, David Dean, Sridha Sridharan, Clinton Fookes
DOI: https://doi.org/10.48550/arXiv.1610.02831
2016-10-10
Sound
Abstract: This paper explores how in- and out-domain probabilistic linear discriminant analysis (PLDA) speaker verification behaves when enrolment and verification lengths are reduced. Experimental studies have found that when full-length utterances are used for evaluation, the in-domain PLDA approach shows more than 28% improvement in EER and DCF values over the out-domain PLDA approach, and that when short utterances are used for evaluation, the performance gain of in-domain speaker verification reduces at an increasing rate. A novel modified inter-dataset variability (IDV) compensation is used to compensate for the mismatch between in- and out-domain data, and IDV-compensated out-domain PLDA shows 26% and 14% improvement over out-domain PLDA speaker verification when SWB and NIST data, respectively, are used for S-normalization. When the evaluation utterance length is reduced, the performance gain from IDV compensation also reduces, because the i-vectors of short evaluation utterances vary more due to phonetic content than due to the dataset mismatch between in- and out-domain data.
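The abstract mentions S normalization without defining it. The sketch below assumes it refers to the commonly used symmetric score normalization, in which a raw trial score is standardized once against cohort scores for the enrolment side and once against cohort scores for the test side, and the two results are averaged. The function name, parameter names, and numbers are hypothetical and are not taken from the paper.

```python
from statistics import fmean, pstdev


def s_norm(raw_score, enrol_cohort_scores, test_cohort_scores):
    """Symmetric score normalization (assumed form, not the paper's code).

    raw_score: score of the enrolment/test trial.
    enrol_cohort_scores: scores of the enrolment utterance against a cohort.
    test_cohort_scores: scores of the test utterance against the same cohort.
    """
    # Standardize the raw score against the enrolment-side cohort statistics.
    z_enrol = (raw_score - fmean(enrol_cohort_scores)) / pstdev(enrol_cohort_scores)
    # Standardize the raw score against the test-side cohort statistics.
    z_test = (raw_score - fmean(test_cohort_scores)) / pstdev(test_cohort_scores)
    # Average the two standardized scores.
    return 0.5 * (z_enrol + z_test)


# Toy numbers chosen only to make the arithmetic easy to follow.
normalized = s_norm(2.0, [0.0, 2.0], [1.0, 3.0])
```

In this reading, the choice of cohort (SWB or NIST data in the abstract) only changes which scores supply the means and standard deviations.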