Fine-tuning Large Language Models for Domain-specific Machine Translation

Jiawei Zheng, Hanghai Hong, Xiaoli Wang, Jingsong Su, Yonggui Liang, Shikai Wu
2024-01-01
Abstract: Large language models (LLMs) have made significant progress in machine translation (MT). However, their potential in domain-specific MT remains under-explored. Current LLM-based MT systems still face several challenges. First, for LLMs with in-context learning, their effectiveness is highly sensitive to input translation examples, and processing them can increase inference costs. They often require extra post-processing due to over-generation. Second, LLMs with fine-tuning on domain-specific data often require high training costs for domain adaptation, and may weaken the zero-shot MT capabilities of LLMs due to over-specialization. The aforementioned methods can struggle to translate rare words in domain transfer scenarios. To address these challenges, this paper proposes a prompt-oriented fine-tuning method, denoted as LlamaIT, to effectively and efficiently fine-tune a general-purpose LLM for domain-specific MT tasks. First, we construct a task-specific mix-domain dataset, which is then used to fine-tune the LLM with LoRA. This can eliminate the need for input translation examples, post-processing, or over-specialization. By zero-shot prompting with instructions, we adapt the MT tasks to the target domain at inference time. To further elicit the MT capability for rare words, we construct new prompts by incorporating domain-specific bilingual vocabulary. We also conduct extensive experiments on both publicly available and self-constructed datasets. The results show that our LlamaIT can significantly enhance the domain-specific MT capabilities of the LLM, while preserving its zero-shot MT capabilities.
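The idea of a zero-shot instruction prompt enriched with domain-specific bilingual vocabulary can be illustrated with a minimal sketch. The abstract does not give the actual LlamaIT prompt template, so the function name `build_prompt`, the wording of the instruction, and the sample glossary below are all hypothetical and serve only to show the general shape of such a prompt.

```python
def build_prompt(source, src_lang, tgt_lang, domain, glossary=None):
    """Build a zero-shot translation instruction with optional glossary hints.

    The template is an assumption for illustration, not the paper's template.
    """
    lines = [f"Translate the following {domain} text from {src_lang} to {tgt_lang}."]
    # Keep only glossary entries whose source term occurs in the sentence,
    # so the prompt stays short and contains no irrelevant hints.
    hints = [(s, t) for s, t in (glossary or {}).items()
             if s.lower() in source.lower()]
    if hints:
        lines.append("Use these term translations:")
        lines.extend(f"- {s} -> {t}" for s, t in hints)
    lines.append(f"{src_lang}: {source}")
    lines.append(f"{tgt_lang}:")
    return "\n".join(lines)


# Hypothetical domain glossary (source term -> target term).
glossary = {"circuit breaker": "断路器", "transformer": "变压器"}
prompt = build_prompt(
    "Reset the circuit breaker before testing.",
    "English", "Chinese", "electrical engineering", glossary,
)
print(prompt)
```

Only the glossary entry that matches the input sentence is injected, and no translation examples are included, which mirrors the zero-shot setting the abstract describes.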