Speaker Identification Using MFCC Feature Extraction ANN Classification Technique

Mahesh K. Singh
DOI: https://doi.org/10.1007/s11277-024-11282-1
IF: 2.017
2024-06-20
Wireless Personal Communications
Abstract: This work addresses automatic speaker recognition (ASR) using the cepstral characteristics of a speech sample. The vast majority of efficient speaker recognition systems rely on cepstral learning techniques. We hypothesize that using the pitch frequency of the speech sample will enhance speaker recognition. The proposed framework employs an artificial neural network (ANN) classifier. We examine several transform domains to extract precise information from the recorded speech signal, and we carry out pre-processing interference detection before feature extraction to enhance the efficiency of ASR. Simulations establish that the regularized pitch frequency (RPF) feature improves the speaker identification system's performance using the discrete cosine transform (DCT), discrete wavelet transform (DWT), and discrete sine transform (DST). We use vector quantization together with Mel-frequency cepstral coefficient (MFCC) feature extraction to reduce the quantity of data that needs processing. The novelty of the proposed work is the ANN classification result for speaker identification obtained with the DCT, DWT, and DST transforms after applying MFCC feature extraction. With speech samples, the ANN classifier achieves speaker recognition rates of 71%, 73%, and 81% for features from the DWT, DCT, and DST, respectively. Without a speech sample, the recognition rates are 38%, 40%, and 48% for features from the DWT, DCT, and DST, respectively. The results show that classification with a speech sample yields a higher recognition rate than classification without one.
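For readers unfamiliar with the MFCC front end named in the abstract, the following is a minimal sketch of a textbook MFCC computation using NumPy only. It is not the paper's implementation: the function names (`mfcc`, `mel_filterbank`, `dct_ii_matrix`) and every parameter value (16 kHz sampling rate, 25 ms frames, 10 ms hop, 26 mel filters, 13 coefficients, 0.97 pre-emphasis) are assumptions chosen for illustration, and the pitch feature, the DWT/DST variants, vector quantization, and the ANN classifier are all omitted.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula; other variants exist.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose centers are evenly spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def dct_ii_matrix(n_out, n_in):
    # Unnormalized DCT-II basis: row k, column n.
    n = np.arange(n_in)
    k = np.arange(n_out)[:, None]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13):
    # Assumes len(signal) >= frame_len; all defaults are illustrative.
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis boosts high frequencies before framing.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum of each windowed frame (zero-padded to n_fft).
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Mel filterbank energies, log compression, then DCT to decorrelate.
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    return log_energies @ dct_ii_matrix(n_ceps, n_filters).T
```

With these assumed defaults, a one-second signal sampled at 16 kHz yields a 98 × 13 matrix, one row of cepstral coefficients per frame. Such frame-level features would then typically be summarized or quantized before being passed to a classifier.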
telecommunications