Abstract: Practical natural language processing (NLP) tasks are commonly long-tailed and have noisy labels. These problems challenge the generalization and robustness of complex models such as Deep Neural Networks (DNNs). Commonly used resampling techniques, such as oversampling or undersampling, can easily lead to overfitting. It is increasingly popular to learn data weights by leveraging a small amount of metadata. In addition, recent studies have shown the advantages of self-supervised pre-training, particularly for under-represented data. In this work, we propose a general framework to handle both long-tailed distributions and noisy labels. The model is adapted to the problem domain in a contrastive learning manner. The re-weighting module is a feed-forward network that learns explicit weighting functions and adapts weights according to metadata. The framework further adapts the weights of terms in the loss function through a combination of the polynomial expansion of cross-entropy loss and focal loss. Our extensive experiments show that the proposed framework consistently outperforms baseline methods. Lastly, our sensitivity analysis emphasizes the capability of the proposed framework to handle the long-tailed problem and mitigate the negative impact of noisy labels.

What problem does this paper attempt to address?

This paper addresses how to handle long-tailed distributions and noisy labels at the same time in natural language processing (NLP) tasks. It points out that, in practice, NLP data is usually long-tailed and its labels may contain noise. Both properties challenge the generalization ability and robustness of complex models such as deep neural networks (DNNs). Existing resampling techniques, such as oversampling or undersampling, are prone to overfitting. Recent research has shown that learning data weights from a small amount of metadata, and self-supervised pre-training, both help with under-represented data. However, these methods remain limited when noisy labels and a long-tailed distribution occur together.

To solve these problems, the paper proposes a general framework, APAM (Adaptive Pre-training and Adaptive Meta Learning), which targets long-tailed distributions and noisy labels simultaneously. The framework adapts the model to the problem domain through contrastive learning and uses a feed-forward network as a re-weighting module that learns an explicit weighting function and adapts the weights according to metadata. It further adjusts the weights of the terms in the loss function by combining the polynomial expansion of cross-entropy loss with focal loss. Experimental results show that the proposed framework consistently outperforms the baseline methods, and the sensitivity analysis emphasizes its ability to handle long-tailed problems and mitigate the negative impact of noisy labels.
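The two loss-side ideas in this summary can be made concrete with a small sketch. It is not the paper's implementation. It assumes the "polynomial expansion of cross-entropy loss and focal loss" resembles the Poly-1 focal loss from the PolyLoss literature, and that the re-weighting module maps each example's loss to a weight in (0, 1), as loss-based weighting networks commonly do. The function names, the values of `gamma` and `epsilon`, and the fixed network parameters are all hypothetical; in a meta-learning setup the network parameters would be learned using the metadata, and that step is omitted here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def poly1_focal_loss(logits, target, gamma=2.0, epsilon=1.0):
    """Focal loss plus one extra polynomial term, for a single example.

    Assumed form: -(1 - p_t)^gamma * log(p_t) + epsilon * (1 - p_t)^(gamma + 1).
    gamma and epsilon are illustrative defaults, not values from the paper.
    With gamma = 0 and epsilon = 0 this reduces to plain cross-entropy.
    """
    p_t = max(softmax(logits)[target], 1e-12)  # clamp to avoid log(0)
    focal = -((1.0 - p_t) ** gamma) * math.log(p_t)
    poly1 = epsilon * (1.0 - p_t) ** (gamma + 1.0)
    return focal + poly1


def weight_net(loss, w1, b1, w2, b2):
    """Tiny feed-forward net (one ReLU hidden layer): scalar loss -> weight in (0, 1)."""
    hidden = [max(0.0, w * loss + b) for w, b in zip(w1, b1)]
    score = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-score))  # sigmoid keeps the weight bounded


def weighted_batch_loss(batch_logits, targets, params):
    """Average of per-example losses, each scaled by its learned weight."""
    losses = [poly1_focal_loss(l, t) for l, t in zip(batch_logits, targets)]
    weights = [weight_net(loss, *params) for loss in losses]
    total = sum(w * l for w, l in zip(weights, losses)) / len(losses)
    return total, weights, losses


# Hypothetical, hand-picked parameters (w1, b1, w2, b2) standing in for learned ones.
params = ([1.0, -1.0], [0.0, 0.5], [-1.0, 0.5], 0.0)

# Two examples labelled class 0: one the model agrees with, one it disagrees with.
batch_logits = [[2.0, 0.0], [0.0, 2.0]]
targets = [0, 0]

total, weights, losses = weighted_batch_loss(batch_logits, targets, params)
print("per-example losses:", losses)
print("per-example weights:", weights)
print("weighted batch loss:", total)
```

With these hand-picked parameters, the example the model disagrees with receives the larger loss and the smaller weight, which is the behaviour one would want for a suspected noisy label. A learned weighting function could behave differently, for instance by up-weighting high-loss examples from tail classes.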

APAM: Adaptive Pre-training and Adaptive Meta Learning in Language Model for Noisy Labels and Long-tailed Learning

Learning from Noisy Labels with Decoupled Meta Label Purifier

Learning with Noisy Labels Via Self-supervised Adversarial Noisy Masking

Hierarchical Noise-Tolerant Meta-Learning With Noisy Labels

Learning from Noisy Labels via Self-Taught On-the-Fly Meta Loss Rescaling

Adaptive Textual Label Noise Learning Based on Pre-trained Models

Adaptive Integration of Partial Label Learning and Negative Learning for Enhanced Noisy Label Learning

Robust Long-Tailed Learning under Label Noise

Meta-Learning the Difference: Preparing Large Language Models for Efficient Adaptation

Meta Label Correction for Noisy Label Learning

Meta Self-training for Few-shot Neural Sequence Labeling

Adaptive Self-training for Few-shot Neural Sequence Labeling

Learning With Noisy Labels Over Imbalanced Subpopulations

Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks

Meta-Learning for Decoding Neural Activity Data with Noisy Labels

Learning from Noisy Labels for Long-tailed Data via Optimal Transport

Active Negative Loss: A Robust Framework for Learning with Noisy Labels

Meta-learning with normalized projection loss reweighting for webly supervised fine-grained recognition

Meta-probability Weighting for Improving Reliability of DNNs to Label Noise

Co-LDL: A Co-Training-Based Label Distribution Learning Method for Tackling Label Noise

LNL+K: Enhancing Learning with Noisy Labels Through Noise Source Knowledge Integration