Abstract: Without any explicit cross-lingual training data, multilingual language models can achieve cross-lingual transfer. One common way to improve this transfer is to perform realignment steps before fine-tuning, i.e., to train the model to build similar representations for pairs of words from translated sentences. However, such realignment methods have been found not to always improve results across languages and tasks, which raises the question of whether aligned representations are truly beneficial for cross-lingual transfer. We provide evidence that alignment is in fact significantly correlated with cross-lingual transfer across languages, models, and random seeds. We show that fine-tuning can have a significant impact on alignment, depending mainly on the downstream task and the model. Finally, we show that realignment can, in some instances, improve cross-lingual transfer, and we identify conditions under which realignment methods provide significant improvements. Namely, we find that realignment works better on tasks for which alignment is correlated with cross-lingual transfer, when generalizing to a distant language, and with smaller models, as well as when using a bilingual dictionary rather than FastAlign to extract realignment pairs. For example, for POS tagging between English and Arabic, realignment can bring a +15.8 accuracy improvement on distilmBERT, even outperforming XLM-R Large by 1.7. We thus advocate for further research on realignment methods for smaller multilingual models as an alternative to scaling.

Improving In-context Learning of Multilingual Generative Language Models with Cross-lingual Alignment

Iterative Task-adaptive Pretraining for Unsupervised Word Alignment

Cross-Align: Modeling Deep Cross-lingual Interactions for Word Alignment

PreAlign: Boosting Cross-Lingual Transfer by Early Establishment of Multilingual Alignment

Multi-Level Contrastive Learning for Cross-Lingual Alignment

An Unsupervised Cross-lingual Word Alignment Approach Based on Cycle-GAN and Hybrid Training

Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners

Enhancing Cross-lingual Sentence Embedding for Low-resource Languages with Word Alignment

Improving In-context Learning via Bidirectional Alignment

Multilingual Pretraining and Instruction Tuning Improve Cross-Lingual Knowledge Alignment, But Only Shallowly

Exploring the Relationship between Alignment and Cross-lingual Transfer in Multilingual Transformers

Cross-Lingual Supervision improves Large Language Models Pre-training

Probing the Emergence of Cross-lingual Alignment during LLM Training

Multilingual Sentence Transformer As A Multilingual Word Aligner

How Transliterations Improve Crosslingual Alignment

Improving Pretrained Cross-Lingual Language Models Via Self-Labeled Word Alignment

Emerging cross-lingual structure in pretrained language models

Supervised Contrastive Learning for Cross-Lingual Transfer Learning

CrossIn: An Efficient Instruction Tuning Approach for Cross-Lingual Knowledge Alignment

Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment

Learning Multilingual Representation for Natural Language Understanding with Enhanced Cross-Lingual Supervision