Abstract: Most state-of-the-art speaker recognition systems use short-term cepstral feature extraction to parameterize speech signals. In this paper, we propose a new auditory feature for speaker recognition, called Caelen Auditory Model Gammatone Cepstral Coefficients (CAMGTCC), which combines the Caelen auditory model, simulating the outer, middle and inner parts of the ear, with a Gammatone filter. The proposed feature is evaluated on the TIMIT and NIST 2008 corpora. Speech coding distortion is represented by the Adaptive Multi-Rate Wideband (AMR-WB) codec, and noisy conditions are simulated with noises taken from NOISEX-92 at various SNR levels. The speaker recognition systems use GMM-UBM and i-vector-GPLDA modelling. The experimental results show that the proposed feature extraction method outperforms Gammatone Cepstral Coefficients (GTCC) and Mel Frequency Cepstral Coefficients (MFCC) features. Under speech coding distortion, the proposed features improve robustness to codec-degraded speech at different bit rates. In addition, when the test speech signals are corrupted with noise at SNRs ranging from 0 dB to 15 dB, CAMGTCC achieves an overall relative equal error rate (EER) reduction of 10.88% to 6.8% compared to the baselines.
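The abstract reports results as relative EER reductions without describing the scoring protocol. As a rough illustration only, the sketch below shows one common way to approximate an EER from verification scores and to express an improvement as a relative reduction. The function names, the threshold sweep, and the toy scores are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Hypothetical helper: the paper does not state how its EER was computed.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # False rejection rate: fraction of target trials scoring below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False acceptance rate: fraction of non-target trials at or above the threshold.
    far = np.array([(nontarget >= t).mean() for t in thresholds])
    # The EER lies where the two error rates are closest; average them there.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


def relative_reduction(baseline_eer, proposed_eer):
    """Relative EER reduction in percent, measured against the baseline."""
    return 100.0 * (baseline_eer - proposed_eer) / baseline_eer


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    eer = equal_error_rate([0.9, 0.8, 0.3], [0.1, 0.2, 0.85])
    print(f"EER: {100 * eer:.2f}%")
    print(f"Relative reduction: {relative_reduction(10.0, 9.0):.2f}%")
```

Under this definition, a baseline EER of 10% and a proposed-feature EER of 9% would count as a 10% relative reduction, even though the absolute difference is only one percentage point.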

Speaker Normalization Based on the Generalized Time-Frequency Representation and Mellin Transform

A Novel Speaker Normalization Method Based on Formant Recovery and Mellin Transform

SPEAKER NORMALIZATION AND NOVEL ROBUST SPEECH FEATURE BASED ON MELLIN TRANSFORM

A Novel Robust Feature Of Speech Signal Based On The Mellin Transform For Speaker-Independent Speech Recognition

A new speech feature insensitive to the variation of different speakers

Variant Time-Frequency Cepstral Features for Speaker Recognition

Multi-resolution Time Frequency Feature and Complementary Combination for Short Utterance Speaker Recognition

A perceptually-motivated low-complexity instantaneous linear channel normalization technique applied to speaker verification

Time-frequency Network for Robust Speaker Recognition

Acoustic Feature Extraction Method for Robust Speaker Identification

Double Gaussian Based Feature Normalization For Robust Speech Recognition

Non-negative matrix factorization based discriminative features for speaker verification

Auditory Features with Vocal Track Length Normalization for Language Identification

Speech recognition using Hilbert-Huang transform based features

Short Time Speaker Recognition Method Based on Common Feature Selection

Improvement of MFCC parameters extraction in speaker recognition

A novel hybrid feature method based on Caelen auditory model and gammatone filterbank for robust speaker recognition under noisy environment and speech coding distortion

Extracting the Features of Speaker Verification Based on Fractional Cosine and Sine Transform

Deep Normalization for Speaker Vectors

Robust Speaker Identification Using An Auditory-Based Feature

Study to Speaker Recognition Using RVM