Abstract:While NMT has achieved remarkable results in the last 5 years, production systems come with strict quality requirements in arbitrarily niche domains that are not always adequately covered by readily available parallel corpora. This is typically addressed by training domain specific models, using fine-tuning methods and some variation of back-translation on top of in-domain monolingual corpora. However, industrial practitioners can rarely afford to focus on a single domain. A far more typical scenario includes a set of closely related, yet succinctly different sub-domains. At <a class="link-external link-http" href="http://Booking.com" rel="external noopener nofollow">this http URL</a>, we need to translate property descriptions, user reviews, as well as messages, (for example those sent between a customer and an agent or property manager). An editor might need to translate articles across a set of different topics. An e-commerce platform would typically need to translate both the description of each item and the user generated content related to them. To this end, we propose MDT: a novel method to simultaneously fine-tune on several sub-domains by passing multidimensional sentence-level information to the model during training and inference. We show that MDT achieves results competitive to N specialist models each fine-tuned on a single constituent domain, while effectively serving all N sub-domains, therefore cutting development and maintenance costs by the same factor. Besides BLEU (industry standard automatic evaluation metric known to only weakly correlate with human judgement) we also report rigorous human evaluation results for all models and sub-domains as well as specific examples that better contextualise the performance of each model in terms of adequacy and fluency. To facilitate further research, we plan to make the code available upon acceptance.

A Technique to Pre-trained Neural Network Language Model Customization to Software Development Domain

Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains.

Neural language models for network configuration: Opportunities and reality check

Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

HyperLoRA: Efficient Cross-task Generalization Via Constrained Low-Rank Adapters Generation

A Compact Pretraining Approach for Neural Language Models

Pre-Trained Language Models and Their Applications

Subword language modeling with neural networks

PreConfig: A Pretrained Model for Automating Network Configuration

Need a Small Specialized Language Model? Plan Early!

Multilingual training for Software Engineering

Domain-specific machine translation with recurrent neural network for software localization

Training Neural Response Selection for Task-Oriented Dialogue Systems

Pruning-then-Expanding Model for Domain Adaptation of Neural Machine Translation

Neural personalized response generation as domain adaptation

Better Language Models of Code through Self-Improvement

Multi-Domain Adaptation in Neural Machine Translation Through Multidimensional Tagging

Fine-tuning large neural language models for biomedical natural language processing

On the Domain Adaptation and Generalization of Pretrained Language Models: A Survey

Recurrent Neural Network Based Language Model Personalization by Social Network Crowdsourcing

An overview of domain-specific foundation model: key technologies, applications and challenges