Abstract: Instruction tuning has emerged as a powerful technique, significantly boosting zero-shot performance on unseen tasks. While recent work has explored cross-lingual generalization by applying instruction tuning to multilingual models, previous studies have primarily focused on English, with limited exploration of non-English tasks. For an in-depth exploration of cross-lingual generalization in instruction tuning, we perform instruction tuning individually for two distinct language meta-datasets. Subsequently, we assess the performance on unseen tasks in a language different from the one used for training. To facilitate this investigation, we introduce a novel non-English meta-dataset named "KORANI" (Korean Natural Instruction), comprising 51 Korean benchmarks. Moreover, we design cross-lingual templates to mitigate discrepancies in the language and instruction format of the template between training and inference within the cross-lingual setting. Our experiments reveal consistent improvements through cross-lingual generalization in both English and Korean, outperforming the baseline by average scores of 20.7% and 13.6%, respectively. Remarkably, these enhancements are comparable to those achieved by monolingual instruction tuning and even surpass them in some tasks. The result underscores the significance of relevant data acquisition across languages over linguistic congruence with unseen tasks during instruction tuning.

What problem does this paper attempt to address?

This paper examines how well instruction tuning generalizes zero-shot across languages. Specifically, the researchers aim to improve the performance of multilingual models on unseen tasks when the language of those tasks differs from the language used for training. Previous studies have explored cross-lingual generalization through instruction tuning, but they have focused mainly on English, and non-English tasks remain relatively unexplored. The paper therefore evaluates cross-lingual zero-shot generalization more comprehensively and compares it with monolingual instruction tuning, by introducing a new non-English meta-dataset (KORANI) and designing cross-lingual templates.

The main contributions of the paper are:

1. **Enhancing the understanding of cross-lingual instruction tuning**: The research shows that cross-lingual instruction tuning can achieve performance comparable to, and on some tasks better than, monolingual tuning, underscoring the importance of learning from relevant tasks across languages.
2. **Introducing a new non-English meta-dataset, KORANI**: The dataset contains 51 diverse Korean benchmarks and provides a valuable resource for instruction tuning in non-English languages.
3. **Proposing cross-lingual templates**: By using cross-lingual templates in the training and inference stages, the researchers verify that these templates improve cross-lingual zero-shot generalization.

Through these contributions, the paper addresses a gap in existing research on cross-lingual generalization and provides a reference point for future multilingual model development and application.
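The text above does not spell out how the cross-lingual templates are built, so the following is only a minimal sketch of the general idea: the instruction wording can be in one language while the task instance is in another. The `Template` class, the `TEMPLATES` table, the `render` function, and the template wording are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """Hypothetical instruction template for a natural language inference task."""
    language: str  # language of the instruction wording, e.g. "en" or "ko"
    pattern: str   # must contain {premise} and {hypothesis} placeholders


# Illustrative wording only; the paper's actual templates are not given in this text.
TEMPLATES = {
    "en": Template(
        "en",
        "Premise: {premise}\nHypothesis: {hypothesis}\n"
        "Does the premise entail the hypothesis?",
    ),
    "ko": Template(
        "ko",
        "전제: {premise}\n가설: {hypothesis}\n전제가 가설을 함의합니까?",
    ),
}


def render(example: dict, template_language: str) -> str:
    """Fill the template of the given language with the fields of one example.

    The example text may be in a different language than the template, which is
    the cross-lingual case this sketch is meant to illustrate.
    """
    template = TEMPLATES[template_language]
    return template.pattern.format(**example)


# A Korean instance rendered with an English instruction format.
korean_example = {
    "premise": "고양이가 소파에서 자고 있다.",
    "hypothesis": "동물이 자고 있다.",
}
print(render(korean_example, "en"))
```

Rendering the same instance with `"ko"` instead gives the monolingual counterpart, which makes it easy to compare prompts that differ only in the language of the instruction format.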

Deep Exploration of Cross-Lingual Zero-Shot Generalization in Instruction Tuning

MLAN: Language-Based Instruction Tuning Improves Zero-Shot Generalization of Multimodal Large Language Models

CrossIn: An Efficient Instruction Tuning Approach for Cross-Lingual Knowledge Alignment

Zero-shot cross-lingual transfer in instruction tuning of large language models

Zero-Shot Generalization during Instruction Tuning: Insights from Similarity and Granularity

Multilingual Instruction Tuning With Just a Pinch of Multilinguality

Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions?

X-Instruction: Aligning Language Model in Low-resource Languages with Self-curated Cross-lingual Instructions

Finetuned Language Models Are Zero-Shot Learners

Linguistically-Informed Multilingual Instruction Tuning: Is There an Optimal Set of Languages to Tune?

MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning

Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation

Is It Good Data for Multilingual Instruction Tuning or Just Bad Multilingual Evaluation for Large Language Models?

Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization

xCoT: Cross-lingual Instruction Tuning for Cross-lingual Chain-of-Thought Reasoning

InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning

How Many Languages Make Good Multilingual Instruction Tuning? A Case Study on BLOOM

KIT-19: A Comprehensive Korean Instruction Toolkit on 19 Tasks for Fine-Tuning Korean Large Language Models

Multilingual Pretraining and Instruction Tuning Improve Cross-Lingual Knowledge Alignment, But Only Shallowly

From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning