Abstract: This research work discusses automatic speaker recognition (ASR) using the cepstral characteristics of a speech sample. The vast majority of efficient speaker recognition systems rely on cepstral learning techniques. We hypothesize that using the pitch frequency of the speech sample will enhance speaker recognition. The proposed speaker recognition framework employs an artificial neural network (ANN) classifier. We examine several transform domains to extract precise information from the recorded speech signal, and we carry out interference detection as a pre-processing step before feature extraction to enhance the efficiency of ASR. Simulations establish that the regularized pitch frequency (RPF) feature improves the performance of the speaker identification system when used with the discrete cosine transform (DCT), discrete wavelet transform (DWT), and discrete sine transform (DST). We use the vector quantization technique during feature extraction, with Mel-frequency cepstral coefficients (MFCCs), to reduce the quantity of data that needs processing. The novelty of the proposed work is the ANN classification result for speaker identification obtained with the DCT, DWT, and DST after applying MFCC feature extraction. With the proposed method, the ANN classifier achieves speaker recognition rates of 71%, 73%, and 81% with speech samples for features from the DWT, DCT, and DST, respectively. Without a speech sample, the ANN classifier achieves speaker recognition rates of 38%, 40%, and 48% for features from the DWT, DCT, and DST, respectively. These results show that classification with a speech sample yields a higher recognition rate than classification without one.
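To make the three transform domains named in the abstract concrete, the following is a minimal sketch of each transform applied to a single short frame. The abstract does not state which DCT or DST variant or which wavelet is used, so the unnormalized DCT-II, the unnormalized DST-II, and a one-level Haar wavelet are assumptions chosen only for illustration. The function names `dct2`, `dst2`, and `haar_dwt` are hypothetical and do not come from the paper.

```python
import math


def dct2(frame):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi/N * (n + 0.5) * k).

    Assumption: the paper does not say which DCT variant it uses.
    """
    size = len(frame)
    return [
        sum(frame[n] * math.cos(math.pi / size * (n + 0.5) * k) for n in range(size))
        for k in range(size)
    ]


def dst2(frame):
    """Unnormalized DST-II: X[k] = sum_n x[n] * sin(pi/N * (n + 0.5) * (k + 1)).

    Assumption: the paper does not say which DST variant it uses.
    """
    size = len(frame)
    return [
        sum(frame[n] * math.sin(math.pi / size * (n + 0.5) * (k + 1)) for n in range(size))
        for k in range(size)
    ]


def haar_dwt(frame):
    """One-level Haar DWT of an even-length frame.

    Returns (approximation, detail) coefficient lists.
    Assumption: the Haar wavelet is an illustrative choice, not the paper's.
    """
    if len(frame) % 2 != 0:
        raise ValueError("frame length must be even")
    scale = math.sqrt(2.0)
    approx = [(frame[i] + frame[i + 1]) / scale for i in range(0, len(frame), 2)]
    detail = [(frame[i] - frame[i + 1]) / scale for i in range(0, len(frame), 2)]
    return approx, detail


if __name__ == "__main__":
    # A toy 8-sample frame; real frames would come from recorded speech.
    toy_frame = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
    print("DCT-II:", [round(v, 3) for v in dct2(toy_frame)])
    print("DST-II:", [round(v, 3) for v in dst2(toy_frame)])
    print("Haar DWT:", haar_dwt(toy_frame))
```

In a pipeline like the one described, coefficients of this kind would be computed per frame and passed on to feature extraction and the ANN classifier. Those later stages are not sketched here because the abstract gives no details about them.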
