Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners

Shimao Zhang, Changjiang Gao, Wenhao Zhu, Jiajun Chen, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Shujian Huang
2024-06-19
Abstract: Recently, Large Language Models (LLMs) have shown impressive language capabilities. While most of the existing LLMs have very unbalanced performance across different languages, multilingual alignment based on translation parallel data is an effective method to enhance the LLMs' multilingual capabilities. In this work, we discover and comprehensively investigate the spontaneous multilingual alignment improvement of LLMs. We find that LLMs instruction-tuned on the question translation data (i.e., without annotated answers) are able to encourage the alignment between English and a wide range of languages, even including those unseen during instruction-tuning. Additionally, we utilize different settings and mechanistic interpretability methods to analyze the LLM's performance in the multilingual scenario comprehensively. Our work suggests that LLMs have enormous potential for improving multilingual alignment efficiently with great language and task generalization.
Computation and Language
What problem does this paper attempt to address?
This paper examines how efficiently large language models (LLMs) can improve their multilingual alignment. Existing LLMs perform very unevenly across languages, and multilingual alignment based on translation parallel data is an effective way to narrow that gap. The authors find that instruction-tuning a model on question translation data alone, that is, parallel questions without annotated answers, improves alignment between English and a wide range of languages, including languages unseen during instruction-tuning. They also use different settings and mechanistic interpretability methods to analyze model behavior in multilingual scenarios. The results suggest that training on only a few languages is enough to improve multilingual alignment broadly, with generalization across both languages and tasks.
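To make "question translation data" concrete, the following is a minimal sketch of how such instruction-tuning examples might be assembled from parallel questions. It is an illustration under stated assumptions, not the authors' pipeline: the function name `build_question_translation_data`, the prompt wording in `PROMPT_TEMPLATE`, the `prompt`/`completion` field names, and the choice to translate into English are all hypothetical. The only property taken from the abstract is that each example pairs a question with its translation and carries no annotated answer.

```python
from typing import Dict, List

# Hypothetical prompt wording; the paper's actual template is not given here.
PROMPT_TEMPLATE = "Translate the following question from {src} to {tgt}.\n{question}"


def build_question_translation_data(
    parallel_questions: List[Dict[str, str]],
    source_langs: List[str],
    target_lang: str = "English",
) -> List[Dict[str, str]]:
    """Turn parallel questions into translation-style instruction examples.

    Each record maps a language name to the same question in that language.
    The supervision target is only the translated question; no answer to the
    question is ever included.
    """
    examples = []
    for record in parallel_questions:
        if target_lang not in record:
            continue  # nothing to translate into
        for src in source_langs:
            if src == target_lang or src not in record:
                continue  # skip identity pairs and missing translations
            examples.append(
                {
                    "prompt": PROMPT_TEMPLATE.format(
                        src=src, tgt=target_lang, question=record[src]
                    ),
                    "completion": record[target_lang],
                }
            )
    return examples


if __name__ == "__main__":
    toy_records = [
        {
            "English": "What is 2 + 3?",
            "Chinese": "2 加 3 等于多少?",
            "German": "Was ist 2 + 3?",
        }
    ]
    for example in build_question_translation_data(toy_records, ["Chinese", "German"]):
        print(example)
```

Restricting `source_langs` to a small set mirrors the setting described above, in which only a few languages are seen during instruction-tuning and the remaining languages are evaluated as unseen.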