Multilingual Instruction Tuning With Just a Pinch of Multilinguality

Uri Shaham, Jonathan Herzig, Roee Aharoni, Idan Szpektor, Reut Tsarfaty, Matan Eyal
2024-05-21
Abstract: As instruction-tuned large language models (LLMs) gain global adoption, their ability to follow instructions in multiple languages becomes increasingly crucial. In this work, we investigate how multilinguality during instruction tuning of a multilingual LLM affects instruction-following across languages from the pre-training corpus. We first show that many languages transfer some instruction-following capabilities to other languages from even monolingual tuning. Furthermore, we find that only 40 multilingual examples integrated in an English tuning set substantially improve multilingual instruction-following, both in seen and unseen languages during tuning. In general, we observe that models tuned on multilingual mixtures exhibit comparable or superior performance in multiple languages compared to monolingually tuned models, despite training on 10x fewer examples in those languages. Finally, we find that diversifying the instruction tuning set with even just 2-4 languages significantly improves cross-lingual generalization. Our results suggest that building massively multilingual instruction-tuned models can be done with only a very small set of multilingual instruction-responses.
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper asks how the use of multilingual data during instruction tuning affects a multilingual large language model's (LLM's) ability to follow instructions across languages, including languages that appear in the pre-training corpus but not in the tuning set. The key points of the paper are:

1. **Transferability of Multilingual Instruction Tuning**: Even when instruction tuning is carried out in a single language, it improves the model's ability to follow instructions in other languages to some extent. Tuning on English, Italian, or Spanish yields the best average multilingual performance.
2. **Effect of a Small Amount of Multilingual Data**: Adding just 40 multilingual examples to an English tuning set substantially improves instruction-following in those languages. It also improves performance in languages that were seen only during pre-training and not in the tuning set.
3. **Benefits of Increasing Language Diversity**: Increasing the number of languages in the tuning set further improves cross-lingual generalization. Diversifying the tuning set with as few as 2 to 4 languages already improves it significantly.
4. **Factors Affecting Cross-Lingual Transfer**: The researchers also examined whether language similarity and the amount of pre-training data affect cross-lingual transfer. Language similarity (such as shared script or mutual intelligibility) correlates only weakly with transfer, and so does the proportion of a language's data in the pre-training corpus.

Overall, the paper explores how a limited amount of multilingual data can improve an LLM's ability to follow instructions in multiple languages, making such models more globally applicable.
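
As a rough illustration of the small-multilingual-budget setup summarized above, the following Python sketch builds a mostly English tuning set that contains 40 non-English examples. It is a minimal sketch under stated assumptions and not the authors' pipeline: the function name `mix_tuning_set`, the tuple layout of an example, the toy data, the even split across languages, and the choice to keep the total set size fixed by dropping English examples are all hypothetical.

```python
import random

def mix_tuning_set(english, multilingual_by_lang, n_multilingual=40, seed=0):
    """Return a tuning set of len(english) examples in which n_multilingual
    English examples are swapped for non-English ones (illustrative only)."""
    rng = random.Random(seed)
    langs = sorted(multilingual_by_lang)
    # Assumption: spread the multilingual budget as evenly as possible across languages.
    base, extra = divmod(n_multilingual, len(langs))
    picked = []
    for i, lang in enumerate(langs):
        quota = base + (1 if i < extra else 0)
        picked.extend(rng.sample(multilingual_by_lang[lang], quota))
    # Assumption: keep the total size fixed by dropping as many English examples as were added.
    kept_english = rng.sample(english, len(english) - len(picked))
    mixed = kept_english + picked
    rng.shuffle(mixed)
    return mixed

# Toy data (hypothetical): each example is a (language, instruction, response) tuple.
english = [("en", f"instruction {i}", f"response {i}") for i in range(1000)]
others = {
    lang: [(lang, f"instruction {i}", f"response {i}") for i in range(100)]
    for lang in ["es", "it", "ru", "zh"]
}

mixed = mix_tuning_set(english, others, n_multilingual=40)
non_english = sum(1 for lang, _, _ in mixed if lang != "en")
print(len(mixed), non_english)  # prints: 1000 40
```

With four languages and a budget of 40, each language contributes 10 examples. The seed only affects which examples are drawn, not the counts.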