Multilingual Few-Shot Learning via Language Model Retrieval

Genta Indra Winata, Liang-Kang Huang, Soumya Vadlamannati, Yash Chandarana
2023-06-19
Abstract: Transformer-based language models have achieved remarkable success in few-shot in-context learning and drawn a lot of research interest. However, these models' performance greatly depends on the choice of the example prompts and also has high variability depending on how samples are chosen. In this paper, we conduct a comprehensive study of retrieving semantically similar few-shot samples and using them as the context, as it helps the model decide the correct label without any gradient update in the multilingual and cross-lingual settings. We evaluate the proposed method on five natural language understanding datasets related to intent detection, question classification, sentiment analysis, and topic classification. The proposed method consistently outperforms random sampling in monolingual and cross-lingual tasks in non-English languages.
Computation and Language
What problem does this paper attempt to address?
The paper asks how to retrieve few-shot samples by semantic similarity in multilingual and cross-lingual settings, so as to improve the performance of Transformer-based language models on few-shot in-context learning tasks. Specifically, it focuses on the following aspects:
1. **The influence of selected example prompts**: Existing research shows that the few-shot performance of Transformer language models depends heavily on which example prompts are selected. Different selections can lead to significant differences in model performance.
2. **Comparison between random sampling and semantic-similarity-based sampling**: The paper experimentally compares random sampling with semantic-similarity-based sampling in multilingual and cross-lingual tasks. It finds that using semantically similar samples as context helps the model make correct predictions more effectively, without any gradient updates.
3. **Challenges in cross-lingual tasks**: How to select appropriate context samples in non-English and cross-lingual tasks has not been fully explored. The paper experimentally verifies the effectiveness of semantic-similarity-based sampling in these tasks.
4. **Stability of model performance**: The paper also examines how different sampling strategies affect the stability of model performance. The results show that performance is more stable when the context samples are more semantically similar to the query; conversely, using dissimilar samples degrades performance.
In summary, the paper's main objective is to test systematically, through experiments, whether semantic-similarity-based sampling improves few-shot learning performance in multilingual and cross-lingual tasks, and to provide empirical support for the answer.
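The retrieval idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the paper retrieves samples with a language model, whereas this sketch substitutes a toy character-trigram embedding so that it runs with the standard library alone. The function names (`embed`, `cosine`, `retrieve_examples`, `build_prompt`), the prompt template, and the example pool are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence encoder (assumption): character-trigram counts."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_examples(query, pool, k=2):
    """Return the k labeled examples in the pool most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        pool,
        key=lambda example: cosine(query_vec, embed(example[0])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, examples):
    """Format retrieved examples plus the query as an in-context prompt.

    The template is hypothetical, not the one used in the paper.
    """
    blocks = [f"Text: {text}\nLabel: {label}\n" for text, label in examples]
    blocks.append(f"Text: {query}\nLabel:")
    return "\n".join(blocks)


# Hypothetical labeled pool for an intent detection task.
pool = [
    ("book a flight to Jakarta", "book_flight"),
    ("play some jazz music", "play_music"),
    ("book a flight to Tokyo tomorrow", "book_flight"),
    ("what is the weather today", "get_weather"),
]
query = "book a flight to Paris"
examples = retrieve_examples(query, pool, k=2)
prompt = build_prompt(query, examples)
```

The resulting `prompt` would be passed to a language model, which predicts the label from the retrieved context with no gradient update. A random-sampling baseline would replace `retrieve_examples` with `random.sample(pool, k)`. In a multilingual or cross-lingual setting, `embed` would need to be a multilingual encoder, so that a query in one language can be matched against examples in another.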