Pseudo-pitch-synchronized Phase Information Extraction and Its Application for Robust Speaker Recognition

Longbiao Wang, Seiichi Nakagawa, Jianwu Dang, Jianguo Wei, Tongtong Shen, Lantian Li, Thomas Fang Zheng
DOI: https://doi.org/10.1109/gcce.2017.8229401
2017-01-01
Abstract: Recent studies have shown that phase information contains speaker-dependent characteristics and is effective for speaker recognition. In this paper, we summarize a robust phase feature extracted from the Fourier spectrum (including pitch non-synchronized phase information and pseudo-pitch-synchronized phase information) and its application to speaker recognition for speech at different speaking rates and for noisy speech. We also add evaluations of speaker identification with short-duration training and test speech, and of speaker verification for telephone speech with channel variability. For pseudo-pitch-synchronized phase information extraction, the position of the maximum amplitude in each frame is adopted as the center of the next window. Experiments were conducted using the Japanese Newspaper Article Sentence (JNAS) database, the NTT database, and the NIST SRE 2003 database. The pseudo-pitch-synchronized phase information significantly outperformed our conventional pitch non-synchronized phase information in all cases. Combining the proposed phase information with MFCC remarkably improved speaker recognition performance over MFCC alone.
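The windowing rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `recentered_phase`, the parameters `frame_len` and `hop`, and the choice of a Hamming window are all assumptions made for illustration, and the paper's further processing of the phase values is not reproduced here. The sketch only shows the idea of locating the maximum-amplitude sample in a fixed frame, cutting a new window centered on that sample, and taking the phase of its Fourier spectrum.

```python
import numpy as np

def recentered_phase(signal, frame_len=256, hop=128):
    """Illustrative sketch (hypothetical names and defaults).

    For each fixed-hop frame, find the sample with the largest absolute
    amplitude, cut a new window of the same length centered on that sample,
    and return the phase of its one-sided Fourier spectrum.
    """
    signal = np.asarray(signal, dtype=float)
    half = frame_len // 2
    taper = np.hamming(frame_len)  # assumed taper, not stated in the abstract
    centers, phases = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Position of the maximum amplitude within the current frame.
        peak = start + int(np.argmax(np.abs(frame)))
        lo, hi = peak - half, peak - half + frame_len
        if lo < 0 or hi > len(signal):
            continue  # skip windows that would run past the signal edges
        window = signal[lo:hi] * taper
        phases.append(np.angle(np.fft.rfft(window)))
        centers.append(peak)
    return np.array(centers), np.array(phases)

# Usage on a synthetic impulse train with a 100-sample period.
train = np.zeros(2000)
train[::100] = 1.0
centers, phases = recentered_phase(train)
```

Because every re-centered window starts at a comparable point in the waveform cycle, the phase values vary less with the arbitrary position of the original frame, which is the motivation for synchronizing the window in the first place.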