Abstract: Despite their strong ability to retrieve knowledge in English, current large language models show imbalanced abilities in different languages. Two approaches are proposed to address this, i.e., multilingual pretraining and multilingual instruction tuning. However, whether and how such methods contribute to the cross-lingual knowledge alignment inside the models is unknown. In this paper, we propose CLiKA, a systematic framework to assess the cross-lingual knowledge alignment of LLMs at the Performance, Consistency and Conductivity levels, and explore the effect of multilingual pretraining and instruction tuning on the degree of alignment. Results show that while both multilingual pretraining and instruction tuning are beneficial for cross-lingual knowledge alignment, the training strategy needs to be carefully designed. Namely, continued pretraining improves the alignment of the target language at the cost of other languages, while mixed pretraining affects other languages less. Also, the overall cross-lingual knowledge alignment, especially at the conductivity level, is unsatisfactory for all tested LLMs, and neither multilingual pretraining nor instruction tuning can substantially improve the cross-lingual knowledge conductivity.

What problem does this paper attempt to address?

The paper addresses the imbalance in knowledge alignment across languages in current multilingual large language models. Although these models perform well on English tasks, their performance on non-English tasks is noticeably weaker. Specifically, the paper examines whether multilingual pre-training and instruction tuning improve a model's ability to align knowledge across languages, with particular attention to cross-lingual knowledge transfer (that is, whether knowledge learned in one language can be effectively retrieved in another). Whether and how these two methods affect cross-lingual knowledge alignment inside the model has so far been unclear.

To evaluate the impact of multilingual pre-training and instruction tuning on cross-lingual knowledge alignment, the authors propose CLiKA, a systematic framework that measures the degree of alignment at three levels: Performance, Consistency and Conductivity. The results show that:

1. **Multilingual pre-training and instruction tuning are beneficial to cross-lingual knowledge alignment, but the effect is limited**:
   - Continued pre-training improves knowledge alignment for the target language, but at the cost of performance in other languages.
   - Mixed pre-training can greatly improve basic ability and knowledge performance in multiple languages, with less impact on other languages.
   - Neither continued pre-training nor mixed pre-training significantly improves cross-lingual knowledge conductivity.
2. **Cross-lingual knowledge conductivity is generally low**:
   - Even after multilingual pre-training and instruction tuning, all tested large language models still perform poorly at cross-lingual knowledge transfer, especially at the conductivity level.
3. **Different training strategies have different effects**:
   - Mixed pre-training is more effective at improving performance and consistency across multiple languages.
   - Continued pre-training can improve performance in the target language, but may damage performance in other languages, and brings limited improvement in the consistency and conductivity of cross-lingual knowledge.

In general, the paper evaluates the impact of multilingual pre-training and instruction tuning on cross-lingual knowledge alignment and proposes the systematic evaluation framework CLiKA, with the aim of providing a reference for the future optimization of multilingual large language models.
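To make the first two evaluation levels more concrete, here is a minimal Python sketch. It is an illustration only and not the metric definitions used by CLiKA: it assumes the same multiple-choice questions are asked in each language, takes Performance to be per-language accuracy, and approximates Consistency as the Jaccard overlap between the sets of questions answered correctly in two languages. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def accuracy(preds, gold):
    """Performance proxy (assumed): fraction of questions answered correctly."""
    return sum(p == g for p, g in zip(preds, gold)) / len(gold)


def consistency(preds_a, preds_b, gold):
    """Consistency proxy (assumed): Jaccard overlap of the sets of questions
    that each of the two languages answers correctly."""
    correct_a = {i for i, (p, g) in enumerate(zip(preds_a, gold)) if p == g}
    correct_b = {i for i, (p, g) in enumerate(zip(preds_b, gold)) if p == g}
    union = correct_a | correct_b
    # If neither language answers anything correctly, treat them as fully consistent.
    return len(correct_a & correct_b) / len(union) if union else 1.0


# Toy data: the same four questions posed in three languages (hypothetical answers).
gold = ["A", "C", "B", "D"]
preds = {
    "en": ["A", "C", "B", "D"],
    "zh": ["A", "C", "D", "D"],
    "ar": ["A", "B", "D", "A"],
}

per_language = {lang: accuracy(p, gold) for lang, p in preds.items()}
pairwise = {
    (a, b): consistency(preds[a], preds[b], gold)
    for a, b in combinations(preds, 2)
}

print(per_language)
print(pairwise)
```

Conductivity is not covered by this sketch: measuring it would require knowledge that the model learned in one language and is then queried in another, which needs a controlled training setup and not just a set of test answers.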

Multilingual Pretraining and Instruction Tuning Improve Cross-Lingual Knowledge Alignment, But Only Shallowly

Iterative Task-adaptive Pretraining for Unsupervised Word Alignment

CrossIn: An Efficient Instruction Tuning Approach for Cross-Lingual Knowledge Alignment

Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions?

PreAlign: Boosting Cross-Lingual Transfer by Early Establishment of Multilingual Alignment

Improving In-context Learning of Multilingual Generative Language Models with Cross-lingual Alignment

Multilingual Instruction Tuning With Just a Pinch of Multilinguality

InstructAlign: High-and-Low Resource Language Alignment via Continual Crosslingual Instruction Tuning

Is It Good Data for Multilingual Instruction Tuning or Just Bad Multilingual Evaluation for Large Language Models?

Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions

How Many Languages Make Good Multilingual Instruction Tuning? A Case Study on BLOOM

How Transliterations Improve Crosslingual Alignment

S4-Tuning: A Simple Cross-lingual Sub-network Tuning Method

Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners

Exploring the Relationship between Alignment and Cross-lingual Transfer in Multilingual Transformers

Linguistically-Informed Multilingual Instruction Tuning: Is There an Optimal Set of Languages to Tune?

Instruction-tuning Aligns LLMs to the Human Brain

Empowering Cross-lingual Abilities of Instruction-tuned Large Language Models by Translation-following demonstrations

Question Translation Training for Better Multilingual Reasoning

Multilingual Translation with Extensible Multilingual Pretraining and Finetuning

Alignment at Pre-training! Towards Native Alignment for Arabic LLMs