Deep Nonlinear Metric Learning for Speaker Verification in the I-Vector Space

Yong Feng, Qingyu Xiong, Weiren Shi
DOI: https://doi.org/10.1587/transinf.2016edl8106
2017-01-01
IEICE Transactions on Information and Systems
Abstract: Speaker verification is the task of determining whether two utterances were spoken by the same person. Once the utterances are represented in the i-vector space, the crucial remaining problem is how to compute the similarity of two i-vectors. Metric learning provides a viable solution to this problem. Many metric learning algorithms have been proposed, but they are usually limited to learning a linear transformation. In this paper, we propose a nonlinear metric learning method that learns an explicit mapping from the original space to an optimal subspace using a deep Restricted Boltzmann Machine network. The proposed method is evaluated on the NIST SRE 2008 dataset. Owing to its deep learning architecture, the proposed method outperforms some state-of-the-art methods in this evaluation.
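To make the scoring problem concrete, the following is a minimal sketch of comparing two vectors by cosine similarity, first in the original space and then after a stacked nonlinear mapping. It is not the authors' implementation: the function names, the toy dimensions, and the random weights are all assumptions for illustration, and the RBM-based training that the abstract describes is omitted entirely.

```python
import math
import random


def cosine_score(x, y):
    """Cosine similarity between two vectors (a common baseline score)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def linear_map(W, x):
    """Linear transformation W x, the form linear metric learning is limited to."""
    return [sum(w * a for w, a in zip(row, x)) for row in W]


def nonlinear_map(layers, x):
    """Apply a stack of sigmoid layers, giving an explicit nonlinear mapping.

    Each layer is a (W, b) pair. The weights used below are hypothetical
    placeholders; in a real system they would be learned from data.
    """
    h = x
    for W, b in layers:
        pre = linear_map(W, h)
        h = [1.0 / (1.0 + math.exp(-(z + bias))) for z, bias in zip(pre, b)]
    return h


def random_matrix(rng, rows, cols):
    """Small random weight matrix (stand-in for learned weights)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim, hidden, out = 8, 6, 4  # toy sizes chosen for illustration only

# Two stand-in "i-vectors"; real ones would come from an i-vector extractor.
x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
y = [rng.gauss(0.0, 1.0) for _ in range(dim)]

layers = [
    (random_matrix(rng, hidden, dim), [0.0] * hidden),
    (random_matrix(rng, out, hidden), [0.0] * out),
]

baseline = cosine_score(x, y)
mapped = cosine_score(nonlinear_map(layers, x), nonlinear_map(layers, y))
print(f"cosine in original space: {baseline:.3f}")
print(f"cosine after nonlinear mapping: {mapped:.3f}")
```

With untrained weights the mapped score carries no speaker information; the point is only the shape of the computation, in which a learned mapping replaces the identity before the two vectors are compared.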