Abstract: Sequence labeling models often benefit from incorporating external knowledge. However, this practice introduces data heterogeneity and complicates the model with additional modules, leading to increased expenses for training a high-performing model. To address this challenge, we propose a two-stage curriculum learning (TCL) framework specifically designed for sequence labeling tasks. The TCL framework enhances training by gradually introducing data instances from easy to hard, aiming to improve both performance and training speed. Furthermore, we explore different metrics for assessing the difficulty levels of sequence labeling tasks. Through extensive experimentation on six Chinese word segmentation (CWS) and part-of-speech (POS) tagging datasets, we demonstrate the effectiveness of our model in enhancing the performance of sequence labeling models. Additionally, our analysis indicates that TCL accelerates training and alleviates the slow training problem associated with complex models.

What problem does this paper attempt to address?

The paper addresses two linked problems in sequence labeling: how to integrate heterogeneous knowledge effectively, and how to contain the training cost that this integration brings. Existing methods improve model performance by introducing external knowledge such as n-grams, dictionaries, and syntactic information, but they also make the data more heterogeneous and the model more complex, which increases training time and resource consumption.

To address this, the paper proposes a two-stage curriculum learning (TCL) framework designed specifically for sequence labeling tasks. In the first stage, data-level curriculum learning, a simple transfer teacher model is trained on all the data to produce an initial easy-to-hard ordering of the samples for the student model and to help it warm up. In the second stage, model-level curriculum learning, the student model starts training on the subset selected by the teacher model, and the training subset is gradually expanded according to data difficulty and the current state of the student model.

The paper also explores several indicators for measuring the difficulty of sequence labeling samples: a pre-defined indicator, sentence length, and three indicators based on model uncertainty, namely Top-N Least Confidence (TLC), Maximum Normalized Log Probability (MNLP), and Bayesian Uncertainty (BU).

Experimental results show that the proposed TCL framework accelerates training, improves model performance, and can be applied to other sequence labeling models. In an extensive evaluation on six Chinese part-of-speech tagging and word segmentation datasets, the TCL model performs well, particularly on large-scale datasets, which is consistent with the advantage of curriculum learning in handling heterogeneous data.
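The easy-to-hard training idea summarized above can be illustrated with a small sketch. It is not the paper's implementation. It assumes that a teacher model exposes, for each token, the probability of its most likely label, and that labels are scored independently, so an MNLP-style difficulty reduces to the negated mean log probability. The function names, the linear pacing schedule, and the `start_fraction` parameter are hypothetical, and the sketch leaves out the re-scoring by the student model's own state that the second stage relies on.

```python
import math


def mnlp_difficulty(token_probs):
    """MNLP-style difficulty: negated mean log probability of the best labels.

    token_probs holds, per token, the teacher's probability for its most
    likely label (assumption: tokens are scored independently).
    Higher return values mean the sentence is harder.
    """
    log_probs = [math.log(p) for p in token_probs]
    return -sum(log_probs) / len(log_probs)


def rank_easy_to_hard(confidences):
    """Return sentence ids sorted from easiest to hardest."""
    return sorted(confidences, key=lambda sid: mnlp_difficulty(confidences[sid]))


def pacing_fraction(step, total_steps, start_fraction=0.2):
    """Hypothetical linear schedule: share of the data visible at a step."""
    progress = min(1.0, step / total_steps)
    return min(1.0, start_fraction + (1.0 - start_fraction) * progress)


def curriculum_subset(ranked, step, total_steps, start_fraction=0.2):
    """Easiest prefix of the ranked data that the student trains on at a step."""
    fraction = pacing_fraction(step, total_steps, start_fraction)
    size = max(1, round(fraction * len(ranked)))
    return ranked[:size]


# Made-up teacher confidences for five sentences (illustrative values only).
teacher_confidences = {
    "s1": [0.99, 0.98, 0.97],
    "s2": [0.60, 0.55, 0.90],
    "s3": [0.90, 0.85, 0.95],
    "s4": [0.50, 0.40],
    "s5": [0.80, 0.80],
}

ranked = rank_easy_to_hard(teacher_confidences)
for step in (0, 5, 10):
    print(step, curriculum_subset(ranked, step, total_steps=10))
```

With these values the student would begin on the single most confident sentence and see the full set by the final step. A length-based difficulty could be swapped in by replacing `mnlp_difficulty` with a function that returns the number of tokens.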

An Effective Incorporating Heterogeneous Knowledge Curriculum Learning for Sequence Labeling
