Abstract: Speaker recognition under noisy conditions remains a major challenge in speech processing, because background noise significantly degrades system performance. The aim of this work is to identify speakers against both clean and noisy backgrounds using a limited dataset. We propose multitaper-based Mel-frequency cepstral coefficient (MFCC) and power-normalized cepstral coefficient (PNCC) techniques combined with fusion strategies. MFCC and PNCC with different multitapers are used to extract features from the speech samples. The features from both techniques are then normalized using cepstral mean and variance normalization (CMVN) and feature warping (FW). A low-dimensional i-vector model serves as the system model, and several score fusion strategies are applied: mean, maximum, weighted sum, cumulative, and concatenated fusion. Finally, an extreme learning machine (ELM) is used for classification to increase the system identification accuracy (SIA); an ELM is a single-hidden-layer feedforward neural network that is less complex and less time-consuming than other neural networks. The proposed system is evaluated on limited data from two databases, TIMIT and SITW 2016, and the SIA is checked under both clean and noisy background conditions.
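
As a rough illustration of three of the building blocks named in the abstract, the following Python sketch shows per-utterance CMVN, three of the score fusion rules (mean, maximum, weighted sum), and a basic ELM classifier. It is a minimal sketch under stated assumptions, not the authors' implementation: the function and class names (`cmvn`, `fuse_scores`, `ELM`), the `tanh` activation, the fusion weight, and the hidden-layer size are all hypothetical choices. Cumulative and concatenated fusion are left out because the abstract does not define them.

```python
import numpy as np


def cmvn(features):
    """Per-utterance cepstral mean and variance normalization.

    features: array of shape (num_frames, num_coeffs), e.g. MFCC or PNCC.
    Each coefficient is shifted to zero mean and scaled to unit variance.
    """
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / np.maximum(std, 1e-8)  # guard against zero std


def fuse_scores(scores_a, scores_b, method="mean", weight=0.5):
    """Combine two score vectors (assumed here: one score per speaker).

    `weight` is a hypothetical tuning parameter for the weighted sum.
    """
    if method == "mean":
        return (scores_a + scores_b) / 2.0
    if method == "max":
        return np.maximum(scores_a, scores_b)
    if method == "weighted_sum":
        return weight * scores_a + (1.0 - weight) * scores_b
    raise ValueError(f"unknown fusion method: {method}")


class ELM:
    """Basic extreme learning machine: random hidden layer, solved output layer."""

    def __init__(self, num_hidden=64, seed=0):
        self.num_hidden = num_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden activations; the input weights are random and never trained.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        num_classes = int(y.max()) + 1
        self.W = self.rng.standard_normal((X.shape[1], self.num_hidden))
        self.b = self.rng.standard_normal(self.num_hidden)
        targets = np.eye(num_classes)[y]  # one-hot speaker labels
        # Output weights by least squares via the Moore-Penrose pseudoinverse.
        self.beta = np.linalg.pinv(self._hidden(X)) @ targets
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)
```

Because only the output weights are solved for, and in closed form, training involves no iterative backpropagation, which is the usual reason an ELM is described as cheaper to train than other neural networks.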

Auditory Model Based Speech Feature Extraction and Its Application to Speaker Identification

Emotional speaker recognition based on similar neighbor phenomenon

An Auditory Feature Extraction Method Based on Forward-Masking and Its Application in Robust Speaker Identification and Speech Recognition.

Speech Personality Recognition Based on Annotation Classification Using Log-Likelihood Distance and Extraction of Essential Audio Features.

A Forward Masking Auditory Model And Its Application In Speaker Identification And Speech Recognition

Bionic Cepstral coefficients (BCC): A new auditory feature extraction to noise-robust speaker identification

Self-attention Based Speaker Recognition Using Cluster-Range Loss

An Interpretable and Generalizable Speech Detector Based on a CNN-LSTM Framework

Application of a New Mixed Feature in Speaker Identification

Text-independent Speaker Identification Based on Spectral Weighting Functions

An Auditory-Based Monaural Feature for Noisy and Reverberant Speech Enhancement

Modeling of Three Types of Auditory Nerve and Its Application in Speech Recognition.

Further Feature Extraction and Its Application on Speaker Recognition

Short Utterance Speaker Recognition Based on Speech High Frequency Information Compensation and Dynamic Feature Enhancement Methods

Speech Feature Extraction in Broadcast Hosting Based on Fluctuating Equation Inversion

A novel hybrid feature method based on Caelen auditory model and gammatone filterbank for robust speaker recognition under noisy environment and speech coding distortion

ELM speaker identification for limited dataset using multitaper based MFCC and PNCC features with fusion score

Experimental evaluation of a new speaker identification framework using PCA.

Comparison of feature extraction and normalization methods for speaker recognition using grid-audiovisual database

Auditory Features For The Close Talk Speech Enhancement With Parameter Masks