Abstract: Speaker identification is the task of identifying a person from their voice with the help of artificial intelligence (AI) methods. The technology is widely used in voice recognition, security, surveillance, electronic voice eavesdropping, and identity verification. Existing methods do not provide sufficient accuracy or robustness for the speech signal. To overcome these issues, this manuscript proposes an efficient speaker identification framework based on a Mask region-based convolutional neural network (Mask R-CNN) classifier whose parameters are optimized using Hosted Cuckoo Optimization (HCO). The objective of the proposed method is to increase accuracy and to improve the robustness of the signal. Initially, the input speech signals are taken from a real-time dataset. Four types of features are extracted from the input speech signal to improve robustness: Mel Frequency Differential Power Cepstral Coefficients (MFDPCC), Gammatone Frequency Cepstral Coefficients (GFCC), Power Normalized Cepstral Coefficients (PNCC), and spectral entropy. The speaker ID is then classified using the Mask R-CNN classifier, whose parameters are optimized by the HCO algorithm. The method is relevant to real-time applications such as telephone banking and fax mailing. The simulation is executed in MATLAB.
The simulation results show that the proposed Mask-R-CNN-HCO method attains accuracy that is 24.16%, 32.18%, 28.43%, 36.4%, and 33.26% higher, sensitivity that is 37.68%, 33.80%, 24.16%, 32.18%, and 28.43% higher, and precision that is 35.88%, 24.16%, 32.18%, 28.43%, and 26.77% higher than five existing speaker identification methods, respectively: automatic classification using the K-Nearest Neighbors (KNN) algorithm, classification using a multiclass support vector machine (MCSVM), classification using a Gaussian Mixture Model–Convolutional Neural Network (GMMCNN) classifier, classification using a deep neural network (DNN), and classification using a Gaussian Mixture Model–Deep Neural Network (GMMDNN) classifier.
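Of the four features named in the abstract, spectral entropy is the simplest to illustrate. The following is a minimal sketch in Python with NumPy, not the authors' MATLAB implementation. The frame length, hop size, Hann window, and normalization to [0, 1] are assumptions made for illustration, and the function names `spectral_entropy` and `frame_signal` are hypothetical.

```python
import numpy as np

def spectral_entropy(frame, eps=1e-12):
    """Normalized Shannon entropy of one frame's power spectrum (0 = peaky, 1 = flat)."""
    windowed = frame * np.hanning(len(frame))        # assumed Hann window
    power = np.abs(np.fft.rfft(windowed)) ** 2       # one-sided power spectrum
    prob = power / (power.sum() + eps)               # treat spectrum as a distribution
    entropy = -np.sum(prob * np.log2(prob + eps))    # Shannon entropy in bits
    return float(entropy / np.log2(len(prob)))       # divide by the maximum possible entropy

def frame_signal(signal, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames (assumed 25 ms / 10 ms at 16 kHz)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.stack([signal[i * hop : i * hop + frame_len] for i in range(n_frames)])

# Synthetic check: a pure tone has a concentrated spectrum, white noise a flat one.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
noise = np.random.default_rng(0).standard_normal(sr)

tone_entropy = float(np.mean([spectral_entropy(f) for f in frame_signal(tone)]))
noise_entropy = float(np.mean([spectral_entropy(f) for f in frame_signal(noise)]))
print(f"tone: {tone_entropy:.3f}, noise: {noise_entropy:.3f}")
```

Under these assumptions the tone yields a much lower entropy than the noise, which is the property that makes the feature useful for telling structured speech spectra from noise-like frames. In a full system, the per-frame values would be combined with the cepstral features before classification.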

Speaker Identification System Based on Hybrid Neural Network

An HMM/MFNN Hybrid Architecture Based On Stacked Generalization For Speaker Identification

Emotional Speaker Identification By Humans And Machines

Emotional speaker recognition based on similar neighbor phenomenon

Real-time Speaker Recognition System for PDA

Speech Recognition Algorithm Based on Neural Network and Hidden Markov Model

Combined GMM-UBM and SVM Speaker Identification System

Speaker Identification based on LSP and Gaussian Mixture Model

Speaker Recognition Based on SOINN and Incremental Learning Gaussian Mixture Model

Speaker Identification Using a Reference Speaker Model Based a Two-Layer Structure

An efficient speaker identification framework based on Mask R-CNN classifier parameter optimized using hosted cuckoo optimization (HCO)

Development of High Accuracy Classifier for the Speaker Recognition System

Experimental evaluation of a new speaker identification framework using PCA

Speech Recognition System Based on CDHMM/SOFMNN in Noisy Environment

Speech Recognition System Based on SCHMM/ANN in Noisy Environment

Speaker recognition using Improved Butterfly Optimization Algorithm with hybrid Long Short Term Memory network

Using Subband Mel-spectrum Centroid and Gaussian Mixture Correlation for Robust Speaker Identification

Speaker Identification Using Fusion of RBF Networks and Fisher Discriminant

An algorithm for efficient speaker identification using reference speaker model based two-layer structure

Speaker Recognition System Based on Deep Neural Networks and Bottleneck Features