Abstract: Acoustic spoken language recognition (SLR) and phonotactic SLR are currently widely used approaches to language recognition. To achieve better performance, researchers combine multiple subsystems, and the combined results are often much better than those of a single SLR system. Phonotactic SLR subsystems may vary in their acoustic feature vectors or include multiple language-specific phone recognizers and different acoustic models. These methods achieve good performance but usually at high computational cost. In this paper, a new diversification method for phonotactic language recognition systems is proposed that builds vector space models through support vector machine (SVM) supervector reconstruction (SSR). In this architecture, the subsystems share the same feature extraction, decoding, and N-gram counting preprocessing steps, but each models the data in a different vector space by using the SSR algorithm, without significant additional computation. We term this a homogeneous ensemble phonotactic language recognition (HEPLR) system. The system integrates three different SVM supervector reconstruction algorithms: relative SVM supervector reconstruction, functional SVM supervector reconstruction, and perturbing SVM supervector reconstruction. All of the algorithms are combined through a linear discriminant analysis-maximum mutual information (LDA-MMI) backend to improve language recognition evaluation (LRE) accuracy. Evaluated on the National Institute of Standards and Technology (NIST) LRE 2009 task, the proposed HEPLR system achieves better performance than a baseline phone recognition-vector space modeling (PR-VSM) system with minimal extra computational cost. The HEPLR system yields equal error rates (EERs) of 1.39%, 3.63%, and 14.79% for the 30-, 10-, and 3-s test conditions, respectively, representing relative improvements of 6.06%, 10.15%, and 10.53% over the baseline system.
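
The relative improvements quoted in the abstract can be checked with a little arithmetic. The sketch below assumes the usual definition of relative improvement, (baseline EER - system EER) / baseline EER, and back-computes the baseline EER that each reported pair implies. The function names are illustrative, and the implied baseline values are derived here rather than stated in the abstract, so they should be read as approximations.

```python
def relative_improvement(baseline_eer, system_eer):
    """Relative EER reduction, in percent, of a system over a baseline."""
    return 100.0 * (baseline_eer - system_eer) / baseline_eer


def implied_baseline(system_eer, relative_improvement_pct):
    """Baseline EER implied by a system EER and a relative improvement (percent)."""
    return system_eer / (1.0 - relative_improvement_pct / 100.0)


# (test duration, HEPLR EER in %, reported relative improvement in %),
# taken from the abstract. Baseline EERs are NOT given there; they are
# derived below under the assumed definition of relative improvement.
reported = [("30 s", 1.39, 6.06), ("10 s", 3.63, 10.15), ("3 s", 14.79, 10.53)]

for duration, eer, rel in reported:
    base = implied_baseline(eer, rel)
    check = relative_improvement(base, eer)
    print(f"{duration}: implied baseline EER ~ {base:.2f}%, "
          f"relative improvement {check:.2f}%")
```

Under this definition the gain is largest in relative terms on the shorter 10-s and 3-s conditions, even though the absolute EERs there remain much higher than on the 30-s condition.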

An Efficient Approach to Chinese Phoneme Mouth-Shape Recognition

Mouth-Shape Classification And Recognition For Lipreading

Design and implementation of a speaker recognition system

Lip Reading-Based User Authentication Through Acoustic Sensing on Smartphones

MFCC and SVM Based Recognition of Chinese Vowels

Silent Speech Recognition Based on Surface Electromyography

Feature selection of mime speech recognition using surface electromyography data

High Performance Digit Mandarin Speech Recognition

Texture-Constrained Shape Prediction for Mouth Contour Extraction and its State Estimation

Speech Emotion Recognition Based on Formant Characteristics Feature Extraction and Phoneme Type Convergence

Preprocessing Improvement in Mime Speech Recognition based on Surface Electromyogram

Sequence Mouth Shape Classification for Speechreading

Unvoiced Speech Recognition Algorithm Based on Myoelectric Signal

A Speech Recognition System Based on a Hybrid HMM/SVM Architecture

Homogenous Ensemble Phonotactic Language Recognition Based on SVM Supervector Reconstruction

A Universal Phoneme-Set Based Language Independent Short Utterance Speaker Recognition

Discriminative Boosting Algorithm for Diversified Front-End Phonotactic Language Recognition

Mouth Movement Prediction Based on Support Vector Regression

Recognition of Sequence Lip Images and Its Application

Chinese Speech Feature Analysis and Recognition Based on Sinusoidal Model

Which phonemes will distinguish the different regions within the same dialect?