Tower: An Open Multilingual Large Language Model for Translation-Related Tasks

Duarte M. Alves, José Pombal, Nuno M. Guerreiro, Pedro H. Martins, João Alves, Amin Farajian, Ben Peters, Ricardo Rei, Patrick Fernandes, Sweta Agrawal, Pierre Colombo, José G.C. de Souza, André F.T. Martins

2024-02-28

Abstract: While general-purpose large language models (LLMs) demonstrate proficiency on multiple tasks within the domain of translation, approaches based on open LLMs are competitive only when specializing on a single task. In this paper, we propose a recipe for tailoring LLMs to multiple tasks present in translation workflows. We perform continued pretraining on a multilingual mixture of monolingual and parallel data, creating TowerBase, followed by finetuning on instructions relevant for translation processes, creating TowerInstruct. Our final model surpasses open alternatives on several tasks relevant to translation workflows and is competitive with general-purpose closed LLMs. To facilitate future research, we release the Tower models, our specialization dataset, an evaluation framework for LLMs focusing on the translation ecosystem, and a collection of model generations, including ours, on our benchmark.

Computation and Language

What problem does this paper attempt to address?

This paper examines how to adapt open large language models (LLMs) to the range of tasks found in translation workflows. General-purpose LLMs already handle several translation tasks well, but approaches built on open LLMs have been competitive only when specialized for a single task. The authors propose a two-stage recipe: continued pretraining on a multilingual mixture of monolingual and parallel data, which yields TowerBase, followed by finetuning on translation-related instructions, which yields TowerInstruct. The final model outperforms open alternatives on several translation-related tasks and is competitive with closed models such as GPT-4 and GPT-3.5-turbo. According to the reported experiments, TowerInstruct does better than other open models on translation quality, automatic post-editing, grammatical error correction, and named entity recognition.

The authors also build TowerBlocks, a diverse dataset for specializing the model to translation tasks, and run extensive evaluations to check its performance in a multi-task setting. They release the Tower models, the TowerBlocks dataset, the evaluation framework, and a collection of model generations on their benchmark to support future research. In short, the paper builds a multilingual LLM that handles multiple translation-related tasks and narrows the gap between open and closed models in this area.
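The two-stage recipe summarized above can be pictured with a minimal Python sketch of the data preparation alone. Everything in it is an assumption made for illustration: the function names, the text template for parallel pairs, the record fields, the prompt wording, and the `parallel_fraction` value are hypothetical and are not taken from the paper or the released Tower code.

```python
import random
from typing import Dict, Iterable, Iterator


def format_parallel(src_lang: str, tgt_lang: str, src: str, tgt: str) -> str:
    """Render a sentence pair as one plain-text document (hypothetical template)."""
    return f"{src_lang}: {src}\n{tgt_lang}: {tgt}"


def mix_pretraining_stream(
    monolingual: Iterable[str],
    parallel: Iterable[str],
    parallel_fraction: float,
    seed: int = 0,
) -> Iterator[str]:
    """Stage 1 (continued pretraining data): interleave monolingual and
    parallel documents, drawing a parallel one with probability
    `parallel_fraction`. Stops when the chosen source runs out."""
    rng = random.Random(seed)
    mono_iter, para_iter = iter(monolingual), iter(parallel)
    while True:
        source = para_iter if rng.random() < parallel_fraction else mono_iter
        try:
            yield next(source)
        except StopIteration:
            return


def make_instruction_record(task: str, prompt: str, answer: str) -> Dict[str, str]:
    """Stage 2 (instruction finetuning data): one supervised example.
    The field names are assumptions, not the TowerBlocks schema."""
    return {"task": task, "prompt": prompt, "answer": answer}


# Toy data, for illustration only.
monolingual = [
    "Lisboa é a capital de Portugal.",
    "Berlin ist die Hauptstadt Deutschlands.",
]
parallel_docs = [format_parallel("English", "Portuguese", "Good morning.", "Bom dia.")]

# The mixing ratio here is an arbitrary placeholder, not the paper's value.
stage1_docs = list(mix_pretraining_stream(monolingual, parallel_docs, parallel_fraction=0.3))

stage2_records = [
    make_instruction_record(
        task="translation",
        prompt="Translate the following text from English into Portuguese.\n"
               "English: Good morning.\nPortuguese:",
        answer="Bom dia.",
    )
]
```

The sketch covers only how the two kinds of training data might be assembled; the actual training steps (continued pretraining to obtain a base model, then finetuning on instruction records) are outside its scope.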

xTower: A Multilingual LLM for Explaining and Correcting Translation Errors

How Multilingual Are Large Language Models Fine-Tuned for Translation?

The Rise and Down of Babel Tower: Investigating the Evolution Process of Multilingual Code Large Language Model

Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis

Investigating the translation capabilities of Large Language Models trained on parallel data only

Large Language Model for Multi-Domain Translation: Benchmarking and Domain CoT Fine-tuning

A Novel Paradigm Boosting Translation Capabilities of Large Language Models

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

EuroLLM: Multilingual Language Models for Europe

Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners

Tuning Large language model for End-to-end Speech Translation

Is Translation All You Need? A Study on Solving Multilingual Tasks with Large Language Models

TIM: Teaching Large Language Models to Translate with Comparison

Ladder: A Model-Agnostic Framework Boosting LLM-based Machine Translation to the Next Level

How Much Data is Enough Data? Fine-Tuning Large Language Models for In-House Translation: Performance Evaluation Across Multiple Dataset Sizes

Unraveling the Potential of Large Language Models in Code Translation: How Far Are We?

BigTranslate: Augmenting Large Language Models with Multilingual Translation Capability over 100 Languages

A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models

Enhancing Document-level Translation of Large Language Model via Translation Mixed-instructions

Towards Multilingual LLM Evaluation for European Languages