Abstract: As instruction-tuned large language models (LLMs) gain global adoption, their ability to follow instructions in multiple languages becomes increasingly crucial. In this work, we investigate how multilinguality during instruction tuning of a multilingual LLM affects instruction-following across languages from the pre-training corpus. We first show that many languages transfer some instruction-following capabilities to other languages from even monolingual tuning. Furthermore, we find that only 40 multilingual examples integrated in an English tuning set substantially improve multilingual instruction-following, both in seen and unseen languages during tuning. In general, we observe that models tuned on multilingual mixtures exhibit comparable or superior performance in multiple languages compared to monolingually tuned models, despite training on 10x fewer examples in those languages. Finally, we find that diversifying the instruction tuning set with even just 2-4 languages significantly improves cross-lingual generalization. Our results suggest that building massively multilingual instruction-tuned models can be done with only a very small set of multilingual instruction-responses.

What problem does this paper attempt to address?

The paper investigates how multilingual data used during instruction tuning of a multilingual large language model (LLM) affects the model's ability to follow instructions across languages, including languages that were not seen during tuning. The key points of the paper are:

1. **Transferability of Multilingual Instruction Tuning**: Even when instruction tuning is carried out in a single language, it improves, to some extent, the model's ability to follow instructions in other languages. Tuning on English, Italian, or Spanish gives the best average multilingual performance.
2. **Effect of a Small Amount of Multilingual Data**: Adding only 40 multilingual examples to an English tuning set substantially improves instruction-following in those languages. It also improves performance in languages that were seen only during pre-training and not in the tuning set.
3. **Benefits of Increasing Language Diversity**: Increasing the number of languages in the tuning set further improves cross-lingual generalization. Diversifying the tuning set with even just 2 to 4 languages significantly improves it.
4. **Factors Affecting Cross-Lingual Transfer**: The researchers also examined whether language similarity and the amount of pre-training data affect cross-lingual transfer. Language similarity (such as shared script or mutual intelligibility) correlates only weakly with cross-lingual transfer, and so does the proportion of a language's data in the pre-training corpus.

Overall, the paper explores how a limited amount of multilingual data can improve the ability of large language models to follow instructions in multiple languages, making them more globally applicable.
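As a rough illustration of the small-budget finding, the sketch below builds an English tuning set in which a fixed number of examples (40 by default) is swapped for multilingual ones. It is not the authors' code. The example schema, the function name `build_mixture`, the even split across languages, and keeping the total set size fixed are all assumptions made for illustration.

```python
import random
from collections import Counter
from typing import Dict, List

# Hypothetical example schema: {"lang": ..., "instruction": ..., "response": ...}
Example = Dict[str, str]


def build_mixture(
    english: List[Example],
    multilingual: Dict[str, List[Example]],
    n_multilingual: int = 40,
    seed: int = 0,
) -> List[Example]:
    """Swap `n_multilingual` English examples for non-English ones.

    Assumptions (not taken from the paper): the budget is split as evenly
    as possible across the given languages, and the total size stays fixed.
    """
    if not multilingual:
        raise ValueError("need at least one non-English language")
    if n_multilingual > len(english):
        raise ValueError("multilingual budget exceeds English set size")

    rng = random.Random(seed)
    langs = sorted(multilingual)
    # Even split; the first `extra` languages receive one more example.
    base, extra = divmod(n_multilingual, len(langs))

    picked: List[Example] = []
    for i, lang in enumerate(langs):
        k = base + (1 if i < extra else 0)
        if k > len(multilingual[lang]):
            raise ValueError(f"not enough examples for language {lang!r}")
        picked.extend(rng.sample(multilingual[lang], k))

    # Keep the total size constant by dropping as many English examples.
    kept_english = rng.sample(english, len(english) - n_multilingual)
    mixture = kept_english + picked
    rng.shuffle(mixture)
    return mixture


# Toy data, purely illustrative.
toy_english = [
    {"lang": "en", "instruction": f"q{i}", "response": f"a{i}"}
    for i in range(1000)
]
toy_multilingual = {
    lang: [
        {"lang": lang, "instruction": f"{lang}-q{i}", "response": f"{lang}-a{i}"}
        for i in range(20)
    ]
    for lang in ["es", "it", "ru", "zh"]
}
toy_mixture = build_mixture(toy_english, toy_multilingual, n_multilingual=40)
print(Counter(ex["lang"] for ex in toy_mixture))
```

Keeping the total size fixed means that any change in downstream behaviour can be attributed to the language mix rather than to a larger tuning set. Whether that matches the paper's exact setup should be checked against the paper itself.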

Multilingual Instruction Tuning With Just a Pinch of Multilinguality

Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions?

Linguistically-Informed Multilingual Instruction Tuning: Is There an Optimal Set of Languages to Tune?

How Many Languages Make Good Multilingual Instruction Tuning? A Case Study on BLOOM

Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?

MLAN: Language-Based Instruction Tuning Improves Zero-Shot Generalization of Multimodal Large Language Models

From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning

Maybe Only 0.5 Training Data Instruction Tuning

Demystifying Instruction Mixing for Fine-tuning Large Language Models

Zero-shot cross-lingual transfer in instruction tuning of large language models

Improving Multilingual Instruction Finetuning via Linguistically Natural and Diverse Datasets

Is It Good Data for Multilingual Instruction Tuning or Just Bad Multilingual Evaluation for Large Language Models?

Multimodal Instruction Tuning with Conditional Mixture of LoRA

Instruction Tuning for Large Language Models: A Survey

Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions

Dynamics of Instruction Tuning: Each Ability of Large Language Models Has Its Own Growth Pace

LIMIT: Less Is More for Instruction Tuning Across Evaluation Paradigms

CoMMIT: Coordinated Instruction Tuning for Multimodal Large Language Models

Towards Robust Instruction Tuning on Multimodal Large Language Models

Deep Exploration of Cross-Lingual Zero-Shot Generalization in Instruction Tuning

Instruction Mining: Instruction Data Selection for Tuning Large Language Models