Pitch Synchronized Relative Phase with Peak Error Detection for Noise-robust Speaker Recognition

Meng Ge, Longbiao Wang, Seiichi Nakagawa, Yuta Kawakami, Jianwu Dang, Xiangang Li
DOI: https://doi.org/10.1109/ISCSLP.2018.8706701
2018-01-01
Abstract: Conventional speaker identification methods based on mel-frequency cepstral coefficients (MFCCs) ignore phase information. Recent studies have shown that phase information contains speaker-dependent characteristics and is effective for speaker recognition. In this paper, we propose pitch synchronized relative phase information for speaker identification in noisy environments. To mitigate the effect of noise on the extraction of pseudo pitch synchronized relative phase information, we propose peak error detection using an autocorrelation-based algorithm. Experiments were conducted using the JNAS (Japanese Newspaper Article Sentence) database. The method based on pitch synchronized relative phase information with peak error detection achieved a relative speaker identification error reduction of 23.9% compared with conventional phase information (that is, pitch non-synchronized relative phase). Combining the proposed method with MFCC improved the speaker identification rate from 55.0% (MFCC alone) to 76.9%.
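The abstract reports one result as a relative error reduction and another as a change in identification rate, so it may help to see how the two measures relate. The following is a minimal sketch, assuming the usual definition of relative error reduction (the drop in error rate divided by the baseline error rate); the function name `relative_error_reduction` is hypothetical and not taken from the paper. It is applied only to the MFCC figures quoted above, since the abstract does not give the raw rates behind the 23.9% figure.

```python
def relative_error_reduction(baseline_rate: float, improved_rate: float) -> float:
    """Relative error reduction in percent, given two identification rates in percent.

    Assumes the common definition: (baseline error - improved error) / baseline error.
    """
    baseline_error = 100.0 - baseline_rate
    improved_error = 100.0 - improved_rate
    if baseline_error <= 0.0:
        raise ValueError("baseline rate must be below 100% to define a relative reduction")
    return 100.0 * (baseline_error - improved_error) / baseline_error


# Identification rates quoted in the abstract (MFCC alone vs. MFCC combined
# with the proposed phase feature).
mfcc_rate = 55.0
combined_rate = 76.9

reduction = relative_error_reduction(mfcc_rate, combined_rate)
print(f"Relative error reduction: {reduction:.1f}%")
```

Under this assumed definition, raising the identification rate from 55.0% to 76.9% corresponds to cutting the error rate from 45.0% to 23.1%, a relative reduction of roughly 48.7%.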