Cross-Lingual Speaker Verification Based On Linear Transform

Rozi Askar, Dong Wang, Fanhu Bie, Jun Wang, Thomas Fang Zheng
DOI: https://doi.org/10.1109/ChinaSIP.2015.7230457
2015-01-01
Abstract: Speaker verification suffers serious performance degradation when the enrollment and test speech are in different languages. This degradation can be largely attributed to the different distributions of acoustic features across languages. This paper proposes a linear transform approach that projects speech signals from their own language to another language, so that the language mismatch between enrollment and test is mitigated. Constrained maximum likelihood linear regression (CMLLR) is adopted to perform the linear transform in the feature domain. The proposed approach was evaluated on a Chinese-Uyghur cross-lingual speaker verification task. We collected a bilingual speech database, CSLT-CUDGT2014, which consists of 113 female speakers who can speak both Standard Chinese and Uyghur. On this database, the proposed linear transform achieved a relative improvement of about 10% in equal error rate (EER).
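As a rough illustration of what a feature-domain linear transform does, the sketch below maps each acoustic feature frame x to A x + b, which is the general form a CMLLR-style transform takes. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `apply_affine` and `relative_eer_reduction` are hypothetical, the matrix `A` and bias `b` are toy values, and the maximum-likelihood estimation of `A` and `b` is not shown.

```python
from typing import List, Sequence


def apply_affine(
    frames: Sequence[Sequence[float]],
    A: Sequence[Sequence[float]],
    b: Sequence[float],
) -> List[List[float]]:
    """Map every feature frame x to A x + b.

    A and b are assumed to have been estimated elsewhere (for CMLLR,
    by maximum likelihood against a reference model); here they are
    plain inputs.
    """
    projected = []
    for x in frames:
        # One output dimension per row of A, plus the matching bias term.
        y = [
            sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(A, b)
        ]
        projected.append(y)
    return projected


def relative_eer_reduction(eer_baseline: float, eer_new: float) -> float:
    """Relative improvement in EER: (baseline - new) / baseline."""
    return (eer_baseline - eer_new) / eer_baseline


# Toy 2-dimensional frames and a toy transform (illustrative values only).
frames = [[1.0, 2.0], [0.0, -1.0]]
A = [[2.0, 0.0], [0.0, 0.5]]
b = [1.0, -1.0]
projected = apply_affine(frames, A, b)
```

Under this definition of relative improvement, reducing a baseline EER e to 0.9 e corresponds to the roughly 10% relative gain the abstract reports.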