Abstract: In this thesis, we address data scarcity and the limitations of linguistic theory by proposing language-agnostic multi-task training methods. First, we introduce a meta-learning-based approach, meta-transfer learning, in which information is judiciously extracted from high-resource monolingual speech data and transferred to the code-switching domain. Meta-transfer learning quickly adapts the model to the code-switching task from a number of monolingual tasks by learning to learn in a multi-task fashion. Second, we propose a novel multilingual meta-embeddings approach that effectively represents code-switching data by acquiring useful knowledge learned in other languages, learning the commonalities of closely related languages, and leveraging lexical composition. The method is far more efficient than contextualized pre-trained multilingual models. Third, we introduce multi-task learning to integrate syntactic information into a language model as a transfer learning strategy, so that the model learns where to code-switch. To further alleviate the aforementioned issues, we propose a data augmentation method using Pointer-Gen, a neural network with a copy mechanism, to teach the model code-switch points from monolingual parallel sentences. We remove the need for linguistic theory: the model captures code-switching points by attending to input words and aligning the parallel words, without requiring any word alignments or constituency parsers. More importantly, the model can be used effectively for languages that are syntactically different, and it outperforms the linguistic theory-based models.
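
The abstract does not spell out how the multilingual meta-embeddings are computed. As a rough illustration only, the sketch below assumes one common formulation: each source embedding of a word is projected into a shared dimension, scored, and the projections are combined with softmax attention weights. The function names, the dimensions, the random projection matrices, and the two example vectors are all hypothetical and are not taken from the thesis.

```python
import math
import random


def project(matrix, vector):
    """Multiply a (d_out x d_in) matrix by a d_in-dimensional vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def meta_embedding(embeddings, projections, scorer):
    """Combine several source embeddings of one word into a single vector.

    embeddings:  list of vectors, one per source (dimensions may differ)
    projections: list of matrices mapping each source into a shared dimension
    scorer:      vector in the shared dimension used to score each projection
    Returns (combined_vector, attention_weights).
    """
    projected = [project(m, e) for m, e in zip(projections, embeddings)]
    scores = [sum(s * p for s, p in zip(scorer, vec)) for vec in projected]
    weights = softmax(scores)
    dim = len(scorer)
    combined = [
        sum(w * vec[i] for w, vec in zip(weights, projected)) for i in range(dim)
    ]
    return combined, weights


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for a learned projection matrix."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
shared_dim = 4
# Hypothetical embeddings of the same word from two monolingual sources
# (language A with 3 dimensions, language B with 5 dimensions).
embeddings = [[0.2, -0.1, 0.7], [0.5, 0.3, -0.4, 0.1, 0.9]]
projections = [random_matrix(shared_dim, 3, rng), random_matrix(shared_dim, 5, rng)]
scorer = [rng.uniform(-1.0, 1.0) for _ in range(shared_dim)]

vector, weights = meta_embedding(embeddings, projections, scorer)
```

In a trained model the projection matrices and the scoring vector would be learned jointly with the downstream task, so the attention weights would indicate which source embedding the model relies on for each word. Here they are random and serve only to show the data flow.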

Multi-Modal Transformers Utterance-Level Code-Switching Detection

Transformer-Transducers for Code-Switched Speech Recognition

Unified model for code-switching speech recognition and language identification based on a concatenated tokenizer

Code Switched and Code Mixed Speech Recognition for Indic languages

Transformer-based Model for Word Level Language Identification in Code-mixed Kannada-English Texts

Simple yet Effective Code-Switching Language Identification with Multitask Pre-Training and Transfer Learning

Language Modeling for Code-Switched Data: Challenges and Approaches

Decoupling Pronunciation and Language for End-to-End Code-Switching Automatic Speech Recognition

Multilingual Transfer Learning for Code-Switched Language and Speech Neural Modeling

Gujarati-English Code-Switching Speech Recognition using ensemble prediction of spoken language

Cascaded Cross-Modal Transformer for Audio-Textual Classification

Multiresolution and Multimodal Speech Recognition with Transformers

Improving Transformer Based End-to-End Code-Switching Speech Recognition Using Language Identification

Adapting the adapters for code-switching in multilingual ASR

Attention-Guided Adaptation for Code-Switching Speech Recognition

Leveraging Pretrained Word Embeddings for Part-of-Speech Tagging of Code Switching Data

CoVoSwitch: Machine Translation of Synthetic Code-Switched Text Based on Intonation Units

Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation

Progressive Sentiment Analysis for Code-Switched Text Data

CTAL: Pre-training Cross-modal Transformer for Audio-and-Language Representations

Multi-modal Speech Transformer Decoders: When Do Multiple Modalities Improve Accuracy?