Abstract: Digital processing of the speech signal and voice recognition algorithms are essential for fast and accurate automatic voice recognition. The voice signal carries a vast amount of information. Direct analysis and synthesis of the complex voice signal is difficult because of the amount of information it contains. Therefore, digital signal processing steps such as feature extraction and feature matching are introduced to represent the voice signal. Several methods, such as Linear Predictive Coding (LPC), Hidden Markov Models (HMM), and Artificial Neural Networks (ANN), are evaluated with a view to identifying a straightforward and effective method for voice signals. The extraction and matching process is carried out right after the signal has been pre-processed or filtered. Mel Frequency Cepstral Coefficients (MFCCs), a non-parametric method for modelling the human auditory perception system, are used as the feature extraction technique. The nonlinear sequence alignment known as Dynamic Time Warping (DTW), introduced by Sakoe and Chiba, is used as the feature matching technique. Since voice signals tend to have different temporal rates, this alignment is important for better performance. This paper presents the viability of MFCC for extracting features and DTW for comparing the test patterns.
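As an illustration of the feature matching step described in the abstract, the following is a minimal sketch of DTW under stated assumptions, not the paper's implementation. It assumes each utterance has already been reduced to a sequence of per-frame feature vectors (for example MFCC frames), uses the Euclidean distance between frames, and allows the three basic steps (match, insertion, deletion) with equal weights. It omits the Sakoe and Chiba window and slope constraints. The names `frame_distance`, `dtw_distance`, and `nearest_template` are hypothetical.

```python
import math


def frame_distance(x, y):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumption: unconstrained warping with equal step weights; no
    global window or slope constraint is applied.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b frame repeated
                cost[i][j - 1],      # b advances, a frame repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def nearest_template(test, templates):
    """Return the label of the template with the smallest DTW cost.

    `templates` is a dict mapping a label to a feature sequence.
    """
    return min(templates, key=lambda label: dtw_distance(test, templates[label]))
```

A short usage example with toy one-dimensional "features": the same contour spoken at a slower rate still aligns to its template, which is the property the abstract attributes to the alignment step.

```python
templates = {
    "rise_fall": [[0.0], [1.0], [2.0], [1.0], [0.0]],
    "flat": [[2.0], [2.0], [2.0], [2.0], [2.0]],
}
slow_test = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0], [1.0], [0.0]]
print(nearest_template(slow_test, templates))
```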

Entropy of Energy Operator As Feature for Large Vocabulary Mandarin Speaker Independent Speech Recognition

Peripheral Nonlinear Time Spectrum Features Algorithm for Large Vocabulary Mandarin Automatic Speech Recognition

Robust speech recognition in noisy backgrounds based on Teager energy operator and auditory process

A Novel and Efficient Voice Activity Detector Using Shape Features of Speech Wave

Adaptive Compensation Algorithm in Open Vocabulary Mandarin Speaker-Independent Speech Recognition

Time–Frequency Cepstral Features and Heteroscedastic Linear Discriminant Analysis for Language Recognition

Variant Time-Frequency Cepstral Features for Speaker Recognition

Speech Emotion Recognition Based on Syllable-Level Feature Extraction

Real-time Speech Emotion Recognition Based on Syllable-Level Feature Extraction

High Performance Digit Mandarin Speech Recognition

On the Importance of Components of the MFCC in Speech and Speaker Recognition

Vietnamese Speaker Verification With Mel-Scale Filter Bank Energies and Deep Learning

Robust Speech Detection with Heteroscedastic Discriminant Analysis Applied to the Time-frequency Energy

Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and Dynamic Time Warping (DTW) Techniques

A Noise Robust Front End Algorithm for Mandarin Speech Recognition and Performance Analysis

Speaker Recognition Using DMFCC over Telephone Channels

Time-Frequency Cepstral Features and Combining Discriminative Training for Phonotactic Language Recognition

Robust F0 Modeling for Mandarin Speech Recognition in Noise

Mandarin speech-in-noise and tone recognition using vocoder simulations of the temporal limits encoder for cochlear implants

EigenEmo: Spectral Utterance Representation Using Dynamic Mode Decomposition for Speech Emotion Classification

Time-frequency Network for Robust Speaker Recognition