Abstract: With the notable success of pretrained language models, the pretraining-fine-tuning paradigm has become a dominant solution for natural language understanding (NLU) tasks. Typically, the training instances of a target NLU task are introduced in a completely random order and treated equally at the fine-tuning stage. However, these instances can vary greatly in difficulty, and similar to human learning procedures, language models can benefit from an easy-to-difficult curriculum. Based on this concept, we propose a curriculum learning (CL) framework. Our framework consists of two stages, Review and Arrange, targeting the two main challenges in curriculum learning, i.e., how to define the difficulty of instances and how to arrange a curriculum based on the difficulty, respectively. In the first stage, we devise a cross-review (CR) method to train several teacher models first and then review the training set in a crossed manner to distinguish easy instances from difficult instances. In the second stage, two sampling algorithms, a coarse-grained arrangement (CGA) and a fine-grained arrangement (FGA), are proposed to arrange a curriculum for language models in which the learning materials start from the easiest instances, and more difficult instances are gradually added into the training procedure. Compared to previous heuristic CL methods, our framework can avoid the errors caused by a gap in difficulty between humans and machines and has strong generalization ability. We conduct comprehensive experiments, and the results show that our curriculum learning framework, without any manual model architecture design or use of external data, obtains significant and universal performance improvements on a wide range of NLU tasks in different languages.
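
The two stages can be illustrated with a minimal Python sketch. The abstract does not specify the difficulty metric or the sampling details, so everything here is an assumption made for illustration: each teacher is a plain callable trained elsewhere on one shard, difficulty is the number of teachers from other shards that misclassify an instance, and the stages grow cumulatively from easiest to hardest. The names `cross_review` and `arrange_stages` are hypothetical and are not taken from the paper; the staging only loosely resembles a coarse-grained arrangement.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[str, int]       # (text, gold label)
Teacher = Callable[[str], int]  # maps a text to a predicted label


def cross_review(
    shards: Sequence[Sequence[Example]],
    teachers: Sequence[Teacher],
) -> List[Tuple[Example, int]]:
    """Score every example with the teachers that did not train on its shard.

    Assumption: teachers[i] was trained on shards[i]. The difficulty score is
    the number of reviewing teachers whose prediction differs from the label.
    """
    scored = []
    for i, shard in enumerate(shards):
        reviewers = [t for j, t in enumerate(teachers) if j != i]
        for text, label in shard:
            errors = sum(1 for t in reviewers if t(text) != label)
            scored.append(((text, label), errors))
    return scored


def arrange_stages(
    scored: Sequence[Tuple[Example, int]],
    num_stages: int,
) -> List[List[Example]]:
    """Build cumulative stages: each stage adds harder examples to the last.

    Examples are sorted by ascending difficulty (ties keep their input order
    because Python's sort is stable), and stage k contains roughly the
    easiest k/num_stages fraction of the data.
    """
    ordered = [example for example, _ in sorted(scored, key=lambda p: p[1])]
    stages = []
    for k in range(1, num_stages + 1):
        cutoff = max(1, len(ordered) * k // num_stages)
        stages.append(ordered[:cutoff])
    return stages
```

In this sketch the final stage always covers the full training set, so a model fine-tuned stage by stage eventually sees every instance; whether the paper's arrangements behave the same way is not stated in the abstract.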

Review and Arrange: Curriculum Learning for Natural Language Understanding
