Abstract: Natural language understanding (NLU) is the task of semantic decoding of human languages by machines. NLU models rely heavily on large amounts of training data to ensure good performance. However, many languages and domains have very few data resources and domain experts. It is therefore necessary to overcome the data scarcity challenge that arises when very few or even zero training samples are available. In this thesis, we focus on developing cross-lingual and cross-domain methods to tackle these low-resource issues. First, we propose to improve the model's cross-lingual ability by focusing on task-related keywords, enhancing the model's robustness, and regularizing the representations. We find that the representations for low-resource languages can be easily and greatly improved by focusing on just the keywords. Second, we present Order-Reduced Modeling methods for cross-lingual adaptation, and find that modeling partial word orders instead of the whole sequence can improve both the model's robustness to word order differences between languages and the transfer of task knowledge to low-resource languages. Third, we propose to leverage different levels of domain-related corpora and additional masking of data in pre-training for cross-domain adaptation, and discover that more challenging pre-training can better address the domain discrepancy issue in task knowledge transfer. Finally, we introduce a coarse-to-fine framework, Coach, and a cross-lingual and cross-domain parsing framework, X2Parser. Coach decomposes the representation learning process into coarse-grained and fine-grained feature learning, and X2Parser simplifies hierarchical task structures into flattened ones. We observe that simplifying task structures makes representation learning more effective for low-resource languages and domains.

Combination of Data Borrowing Strategies for Low-Resource LVCSR

AudioVSR: Enhancing Video Speech Recognition with Audio Data

Multi-Stream Posterior Features and Combining Subspace GMMs for Low Resource LVCSR

State-Level Data Borrowing for Low-Resource Speech Recognition Based on Subspace GMMs

Cross-Lingual and Ensemble MLPs Strategies for Low-Resource Speech Recognition

Strategies for using MLP based features with limited target-language training data

Visual Information Assisted Mandarin Large Vocabulary Continuous Speech Recognition

Towards High Performance LVCSR in Speech-to-Speech Translation System on Smart Phones

Improvement of Acoustic Models Fused with Lip Visual Information for Low-Resource Speech

Linguistic Search Optimization for Deep Learning Based LVCSR

Large-vocabulary Audio-visual Speech Recognition in Noisy Environments

A General Procedure for Improving Language Models in Low-Resource Speech Recognition

Multilingual Meta-Transfer Learning for Low-Resource Speech Recognition

Language-universal phonetic encoder for low-resource speech recognition

LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition

Improving Continuous Sign Language Recognition with Cross-Lingual Signs

Learning Cross-Lingual Information with Multilingual BLSTM for Speech Synthesis of Low-Resource Languages

Construction of a compact dynamic decoder network for large vocabulary continuous speech recognition

Effective Transfer Learning for Low-Resource Natural Language Understanding

Semi-Supervised Transfer Learning for Language Expansion of End-to-End Speech Recognition Models to Low-Resource Languages

Learning Cross-lingual Mappings for Data Augmentation to Improve Low-Resource Speech Recognition