Prompting Large Language Model for Machine Translation: A Case Study

Biao Zhang, Barry Haddow, Alexandra Birch

2023-01-18

Abstract: Research on prompting has shown excellent performance with little or even no supervised training across many tasks. However, prompting for machine translation is still under-explored in the literature. We fill this gap by offering a systematic study on prompting strategies for translation, examining various factors for prompt template and demonstration example selection. We further explore the use of monolingual data and the feasibility of cross-lingual, cross-domain, and sentence-to-document transfer learning in prompting. Extensive experiments with GLM-130B (Zeng et al., 2022) as the testbed show that 1) the number and the quality of prompt examples matter, where using suboptimal examples degenerates translation; 2) several features of prompt examples, such as semantic similarity, show significant Spearman correlation with their prompting performance; yet, none of the correlations are strong enough; 3) using pseudo parallel prompt examples constructed from monolingual data via zero-shot prompting could improve translation; and 4) improved performance is achievable by transferring knowledge from prompt examples selected in other settings. We finally provide an analysis on the model outputs and discuss several problems that prompting still suffers from.
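To make finding 2) concrete, here is a minimal sketch of how a Spearman rank correlation between a prompt-example feature and translation quality could be computed. The function names (`average_ranks`, `spearman`) and the toy numbers are illustrative assumptions, not values or code from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up toy data: one similarity score and one quality score per prompt example.
similarity = [0.2, 0.5, 0.4, 0.9, 0.7]
quality = [30.1, 31.0, 32.5, 33.0, 31.8]
print(round(spearman(similarity, quality), 3))
```

A correlation can be statistically significant and still too weak to pick good examples reliably, which is the caveat the abstract raises.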

Computation and Language, Machine Learning

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

This paper explores how to use large language models (LLMs) for machine translation (MT) through prompting. Specifically, it addresses the following core questions:

1. **Prompt Strategies**:
   - What is the most suitable prompt template for machine translation? How do templates perform across different languages?
   - Do demonstration examples affect translation quality? How should the best demonstration examples be selected?
2. **Use of Monolingual Data**:
   - How can monolingual data be used to improve translation quality? Is it effective to use monolingual data directly as demonstration examples?
   - Can pseudo-parallel data constructed through back-translation or forward-translation improve translation quality?
3. **Possibility of Transfer Learning**:
   - Do demonstration examples transfer across settings (such as different domains, different language pairs, or from sentence level to document level)?
   - Can cross-domain demonstration examples improve translation performance?

Through this research, the paper aims to fill the gap in the literature on how to use prompting effectively for machine translation, and to examine the effects and potential problems of different prompting strategies.
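The first two questions can be illustrated with a minimal sketch, assuming a simple "language name: sentence" template. The template, the function names (`build_prompt`, `pseudo_parallel_demos`), and the `generate` callable standing in for a model call are all hypothetical; the summary above does not specify the templates or interfaces the paper actually uses.

```python
from typing import Callable, List, Tuple

Demo = Tuple[str, str]  # (source sentence, target sentence)


def build_prompt(demos: List[Demo], source: str,
                 src_lang: str = "English", tgt_lang: str = "German") -> str:
    """Format demonstrations plus the test input; zero-shot when demos is empty."""
    lines = []
    for src, tgt in demos:
        lines.append(f"{src_lang}: {src}")
        lines.append(f"{tgt_lang}: {tgt}")
    lines.append(f"{src_lang}: {source}")
    lines.append(f"{tgt_lang}:")  # left open for the model to complete
    return "\n".join(lines)


def pseudo_parallel_demos(monolingual_sources: List[str],
                          generate: Callable[[str], str],
                          src_lang: str = "English",
                          tgt_lang: str = "German") -> List[Demo]:
    """Forward-translate source-side monolingual text with zero-shot prompts.

    `generate` is an assumed text-completion function (prompt -> continuation).
    """
    demos = []
    for sentence in monolingual_sources:
        prompt = build_prompt([], sentence, src_lang, tgt_lang)
        demos.append((sentence, generate(prompt).strip()))
    return demos


# One-shot prompt built from a single hand-written demonstration.
print(build_prompt([("Hello.", "Hallo.")], "Thank you."))
```

The pairs returned by `pseudo_parallel_demos` can be passed back to `build_prompt` as demonstrations. Back-translation would follow the same pattern with target-side monolingual text and the two language roles swapped.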
