Abstract: Articulatory features (AFs) provide language-independent attributes by exploiting speech production knowledge. This paper proposes a cross-lingual automatic speech recognition (ASR) approach based on AFs. Various neural network (NN) architectures are explored to extract cross-lingual AFs, and their performance is studied. The architectures include the multi-layer perceptron (MLP), convolutional NN (CNN) and long short-term memory recurrent NN (LSTM). In our cross-lingual setup, only the source language (English, representing a well-resourced language) is used to train the AF extractors. AFs are then generated for the target language (Mandarin, representing an under-resourced language) using the trained extractors. The frame-classification accuracy indicates that the LSTM is able to transfer knowledge from the well-resourced to the under-resourced language through robust cross-lingual AFs. The final ASR system is built using traditional approaches (e.g. hybrid models), combining AFs with conventional MFCCs. The results demonstrate that the cross-lingual AFs improve performance in the under-resourced ASR task even though the source and target languages come from different language families. Overall, the proposed cross-lingual ASR approach provides a slight improvement over the monolingual LF-MMI and cross-lingual (acoustic model adaptation-based) ASR systems.
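The abstract mentions two operations that a small sketch can make concrete: combining AFs with MFCCs, and scoring an AF extractor by frame-classification accuracy. The Python below is illustrative only. The abstract does not say how the features are combined, so per-frame concatenation is an assumption here, and the function names `combine_features` and `frame_accuracy` and the toy dimensions are hypothetical.

```python
from typing import List, Sequence


def combine_features(
    mfcc_frames: Sequence[Sequence[float]],
    af_frames: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Append each frame's AF vector to its MFCC vector.

    Assumption: the two streams are frame-aligned and are combined by
    simple concatenation; the paper may use a different scheme.
    """
    if len(mfcc_frames) != len(af_frames):
        raise ValueError("MFCC and AF streams must have the same number of frames")
    return [list(m) + list(a) for m, a in zip(mfcc_frames, af_frames)]


def frame_accuracy(predicted: Sequence[int], reference: Sequence[int]) -> float:
    """Fraction of frames whose predicted AF class matches the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("prediction and reference must have the same length")
    if not reference:
        raise ValueError("cannot score an empty sequence")
    correct = sum(1 for p, r in zip(predicted, reference) if p == r)
    return correct / len(reference)


if __name__ == "__main__":
    # Toy data: 2 frames, 2-dim "MFCCs" and 2-dim AF posteriors.
    mfcc = [[1.0, 2.0], [3.0, 4.0]]
    afs = [[0.9, 0.1], [0.2, 0.8]]
    print(combine_features(mfcc, afs))
    print(frame_accuracy([0, 1, 1, 2], [0, 1, 2, 2]))
```

In a real system the MFCC vectors would typically have a few dozen dimensions and the AF vectors would hold posteriors from the trained extractor; the combined frames would then feed the hybrid acoustic model.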

Multilingual Articulatory Features Augmentation Learning

Phone Recognition for Lhasa-Tibetan Based on Articulatory Features Augmentation Learning

Shared Speech Attribute Augmentation for English-Tibetan Cross-Language Phone Recognition

Articulatory Feature Based Multilingual MLPs for Low-Resource Speech Recognition.

Cross-language speech attribute detection and phone recognition for Tibetan using deep learning

A Chinese Speech Recognition System Based on Articulatory Features

Cross-lingual Automatic Speech Recognition Exploiting Articulatory Features

Improving Minority Language Speech Recognition Based on Distinctive Features

Boosting End-to-End Multilingual Phoneme Recognition through Exploiting Universal Speech Attributes Constraints

Exploiting Articulatory Features for Pitch Accent Detection.

Detection-based accented speech recognition using articulatory features.

Accent Recognition with Hybrid Phonetic Features

Application of Articulatory Feature in Uygur and Mandarin Speech Recognition

Exploiting Cross-domain And Cross-Lingual Ultrasound Tongue Imaging Features For Elderly And Dysarthric Speech Recognition

Utilizing auxiliary data in phoneme recognition based on Articulatory Feature

Robust Speech Recognition Combining Cepstral and Articulatory Features

A METHOD TO CONSTRUCT AN ADAPTIVE MONGOLIAN SPEECH ACOUSTIC MODEL

An Audio-Visual Speech Recognition Framework Based on Articulatory Features.

Exploring Pre-trained Speech Model for Articulatory Feature Extraction in Dysarthric Speech Using ASR

Multi-Modal Acoustic-Articulatory Feature Fusion For Dysarthric Speech Recognition

Integrating Articulatory Features into Acoustic-Phonemic Model for Mispronunciation Detection and Diagnosis in L2 English Speech.