Abstract: Transformer-based language models have achieved remarkable success in few-shot in-context learning and drawn a lot of research interest. However, these models' performance greatly depends on the choice of the example prompts and also has high variability depending on how samples are chosen. In this paper, we conduct a comprehensive study of retrieving semantically similar few-shot samples and using them as the context, as it helps the model decide the correct label without any gradient update in the multilingual and cross-lingual settings. We evaluate the proposed method on five natural language understanding datasets related to intent detection, question classification, sentiment analysis, and topic classification. The proposed method consistently outperforms random sampling in monolingual and cross-lingual tasks in non-English languages.

What problem does this paper attempt to address?

The paper asks how to retrieve few-shot samples by semantic similarity in multilingual and cross-lingual settings, so as to improve the performance of Transformer-based language models on few-shot learning tasks. Specifically, it focuses on the following aspects:

1. **The influence of selected example prompts**: Existing research shows that the few-shot performance of Transformer language models depends heavily on which example prompts are selected. Different selections can lead to significant differences in model performance.
2. **Comparison between random sampling and semantic-similarity-based sampling**: The paper experimentally compares random sampling with semantic-similarity-based sampling in multilingual and cross-lingual tasks. It finds that using semantically similar samples as context helps the model make correct predictions more effectively, without any gradient updates.
3. **Challenges in cross-lingual tasks**: How to select appropriate context samples in non-English and cross-lingual tasks has not been fully explored. The paper experimentally verifies the effectiveness of the semantic-similarity-based sampling strategy in these tasks.
4. **Stability of model performance**: The paper also examines how different sampling strategies affect the stability of model performance. The results show that performance is more stable when the context samples are semantically closer to the query sample; conversely, using dissimilar samples degrades performance.

In summary, the main objective of the paper is to study systematically, and verify experimentally, whether a semantic-similarity-based sampling strategy improves few-shot learning performance in multilingual and cross-lingual tasks, and to provide empirical support for that claim.
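
The retrieval idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: the `embed` function is a toy bag-of-words stand-in for whatever multilingual sentence encoder is actually used, and the example pool, labels, prompt template, and function names are all hypothetical. It shows only the general pattern of ranking labeled examples by similarity to the query and placing the top k in the prompt.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a sentence encoder: a bag-of-words count vector.
    (Assumption: a real system would use a multilingual embedding model.)"""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, pool, k=2):
    """Return the k labeled examples most similar to the query."""
    q = embed(query)
    ranked = sorted(pool, key=lambda ex: cosine(q, embed(ex[0])), reverse=True)
    return ranked[:k]

def build_prompt(query, examples):
    """Format retrieved examples plus the unlabeled query as an in-context prompt."""
    blocks = [f"Text: {text}\nLabel: {label}\n" for text, label in examples]
    blocks.append(f"Text: {query}\nLabel:")
    return "\n".join(blocks)

# Hypothetical labeled pool and query, for illustration only.
pool = [
    ("book a flight to paris", "travel"),
    ("play some jazz music", "music"),
    ("what is the weather in berlin", "weather"),
    ("book a hotel in rome", "travel"),
]
query = "book a flight to rome"
shots = retrieve(query, pool, k=2)
prompt = build_prompt(query, shots)
print(prompt)
```

The prompt would then be passed to a frozen language model, which predicts the label for the final `Text:` entry with no gradient update. Random sampling, the baseline the paper compares against, would correspond to replacing `retrieve` with a random draw of k examples from the pool.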

Multilingual Few-Shot Learning via Language Model Retrieval
