Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?

Tannon Kew, Florian Schottmann, Rico Sennrich

2024-10-04

Abstract: The vast majority of today's large language models (LLMs) are English-centric, having been pretrained predominantly on English text. Yet, in order to meet user expectations, models need to be able to respond appropriately in multiple languages once deployed in downstream applications. This requires strong cross-lingual transfer abilities. In this work, we investigate the minimal amount of multilinguality required during finetuning to elicit cross-lingual generalisation in English-centric LLMs. In experiments across four LLMs, we find that multilingual instruction tuning with as few as two to three languages is both necessary and sufficient to elicit effective cross-lingual generalisation, with the limiting factor being the degree to which a target language is seen during pretraining. Evaluations on five different tasks further reveal that multilingual instruction tuning is most beneficial for generative tasks that assume input/output language agreement, such as in chat settings, while being of less importance for highly structured classification-style tasks. Our code and data are available at https://github.com/ZurichNLP/multilingual-instruction-tuning.

Computation and Language

What problem does this paper attempt to address?

This paper asks how little multilingual data is needed during finetuning for English-centric large language models (LLMs) to generalise across languages. Specifically, the authors study whether a small amount of multilingual instruction data during finetuning is sufficient for these models to perform well in languages not seen during finetuning, especially in generative tasks that assume input/output language agreement. The paper also explores which languages and tasks benefit most from multilingual instruction tuning. In the experiments, for generative tasks (such as single-turn dialogue and sentence simplification), adding data from just two or three languages during finetuning substantially improves performance in other non-English languages, without the need to finetune on every potential target language. The limiting factor is how much of a target language the model saw during pretraining. In contrast, for highly structured tasks (such as commonsense reasoning and natural language inference), multilingual instruction tuning has a less pronounced effect. This finding indicates that building multilingual chat models from English-centric LLMs does not require finetuning data for every possible target language.
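As a rough illustration of what "adding data from two or three languages" could look like in practice, the sketch below builds a finetuning mixture that is mostly English, with a small share split evenly across a few non-English languages. This is not the authors' code (see their repository for the actual setup): the function name `build_mixture`, the `non_english_fraction` parameter, its default of 0.2, and the toy example format are all assumptions made for illustration.

```python
import random
from collections import Counter


def build_mixture(english, multilingual, languages, total,
                  non_english_fraction=0.2, seed=0):
    """Sample `total` instruction examples, mostly English, with a small
    share split evenly across the given non-English languages.

    english:      list of example dicts (hypothetical format)
    multilingual: dict mapping language code -> list of example dicts
    languages:    non-English language codes to include (e.g. two or three)
    """
    rng = random.Random(seed)
    n_non_english = round(total * non_english_fraction)
    per_lang = n_non_english // len(languages) if languages else 0

    mixture = []
    for lang in languages:
        # Draw the same number of examples from each selected language.
        picked = rng.sample(multilingual[lang], per_lang)
        mixture.extend({"lang": lang, **ex} for ex in picked)

    # Fill the remainder of the budget with English examples.
    n_english = total - len(mixture)
    mixture.extend({"lang": "en", **ex} for ex in rng.sample(english, n_english))

    rng.shuffle(mixture)
    return mixture


# Toy data standing in for real instruction/response pairs.
english = [{"instruction": f"en-{i}", "response": "..."} for i in range(100)]
multilingual = {
    lang: [{"instruction": f"{lang}-{i}", "response": "..."} for i in range(100)]
    for lang in ("de", "es", "ru")
}

mix = build_mixture(english, multilingual, ["de", "es"], total=50)
counts = Counter(ex["lang"] for ex in mix)
print(dict(counts))
```

Varying the length of `languages` from zero upwards while holding `total` fixed gives one simple way to probe how many finetuning languages are needed before performance on held-out languages stops improving.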

Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions?

Multilingual Instruction Tuning With Just a Pinch of Multilinguality

RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs

Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca

Could We Have Had Better Multilingual LLMs If English Was Not the Central Language?

Linguistically-Informed Multilingual Instruction Tuning: Is There an Optimal Set of Languages to Tune?

How Many Languages Make Good Multilingual Instruction Tuning? A Case Study on BLOOM

Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners

Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions

Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models

How do languages influence each other? Studying cross-lingual data sharing during LM fine-tuning

Understanding and Mitigating Language Confusion in LLMs

Towards Multilingual LLM Evaluation for European Languages

Is Translation All You Need? A Study on Solving Multilingual Tasks with Large Language Models

How Vocabulary Sharing Facilitates Multilingualism in LLaMA?

Empowering Cross-lingual Abilities of Instruction-tuned Large Language Models by Translation-following demonstrations

Zero-shot cross-lingual transfer in instruction tuning of large language models

LLMs Beyond English: Scaling the Multilingual Capability of LLMs with Cross-Lingual Feedback

How Multilingual Are Large Language Models Fine-Tuned for Translation?

LinguaLIFT: An Effective Two-stage Instruction Tuning Framework for Low-Resource Language Tasks