APAM: Adaptive Pre-training and Adaptive Meta Learning in Language Model for Noisy Labels and Long-tailed Learning

Sunyi Chi, Bo Dong, Yiming Xu, Zhenyu Shi, Zheng Du
DOI: https://doi.org/10.48550/arXiv.2302.03488
2023-05-03
Abstract: Practical natural language processing (NLP) tasks are commonly long-tailed with noisy labels. These problems challenge the generalization and robustness of complex models such as Deep Neural Networks (DNNs). Some commonly used resampling techniques, such as oversampling or undersampling, can easily lead to overfitting. It is increasingly popular to learn the data weights by leveraging a small amount of metadata. Besides, recent studies have shown the advantages of self-supervised pre-training, particularly for under-represented data. In this work, we propose a general framework to handle both long-tailed distributions and noisy labels. The model is adapted to the problem domain in a contrastive learning manner. The re-weighting module is a feed-forward network that learns explicit weighting functions and adapts weights according to metadata. The framework further adapts the weights of terms in the loss function through a combination of the polynomial expansion of cross-entropy loss and focal loss. Our extensive experiments show that the proposed framework consistently outperforms baseline methods. Lastly, our sensitivity analysis emphasizes the capability of the proposed framework to handle the long-tailed problem and mitigate the negative impact of noisy labels.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the problem of handling long-tailed distributions and noisy labels at the same time in natural language processing (NLP) tasks. In practical applications, NLP data is usually long-tailed, and its labels may contain noise. Both properties challenge the generalization ability and robustness of complex models such as deep neural networks (DNNs). Existing resampling techniques, such as oversampling or undersampling, are prone to overfitting. Recent research has shown that learning data weights from a small amount of metadata, and self-supervised pre-training, both help with under-represented data. According to this summary, however, these methods remain limited when noisy labels and long-tailed distributions occur together.

To address this, the paper proposes a general framework, APAM (Adaptive Pre-training and Adaptive Meta Learning), that targets both problems simultaneously. The framework adapts the model to the problem domain through contrastive learning and uses a feed-forward network as a re-weighting module, which learns an explicit weighting function and adapts the weights according to metadata. It further adapts the weights of the terms in the loss function by combining the polynomial expansion of cross-entropy loss with focal loss. In the paper's experiments, the proposed framework consistently outperforms the baseline methods, and the sensitivity analysis emphasizes its ability to handle the long-tailed problem and mitigate the negative impact of noisy labels.
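The loss and re-weighting ideas described above can be illustrated with a minimal sketch. It is not the authors' implementation, and the summary does not give the exact formulation used in APAM. The sketch assumes the published Poly-1 form of focal loss as one plausible way to combine the polynomial expansion of cross-entropy with focal loss, and a one-hidden-layer network that maps each example's loss to a weight in (0, 1). All function names, `gamma`, `epsilon`, and the network parameters are hypothetical. The meta-learning step that would fit the weighting network on the metadata is omitted.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def poly1_focal_loss(logits, target, gamma=2.0, epsilon=1.0):
    """Focal loss plus the first polynomial term (assumed Poly-1 form).

    gamma=0 and epsilon=0 reduce this to plain cross-entropy.
    """
    p_t = max(softmax(logits)[target], 1e-12)  # probability of the true class
    focal = -((1.0 - p_t) ** gamma) * math.log(p_t)
    poly1 = epsilon * (1.0 - p_t) ** (gamma + 1.0)
    return focal + poly1


def example_weight(loss, w1, b1, w2, b2):
    """Hypothetical weighting network: loss -> ReLU hidden layer -> sigmoid."""
    hidden = [max(0.0, w * loss + b) for w, b in zip(w1, b1)]
    score = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-score))


def weighted_batch_loss(batch_logits, targets, net_params, gamma=2.0, epsilon=1.0):
    """Mean of per-example losses, each scaled by its learned weight."""
    losses = [poly1_focal_loss(l, t, gamma, epsilon)
              for l, t in zip(batch_logits, targets)]
    weights = [example_weight(loss, *net_params) for loss in losses]
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Illustrative, hand-picked parameters (in practice these would be learned).
net_params = ([0.5, -0.5], [0.0, 1.0], [1.0, 1.0], 0.0)
batch_logits = [[5.0, 0.0], [0.0, 5.0]]
targets = [0, 0]  # the second example is badly misclassified
batch_loss = weighted_batch_loss(batch_logits, targets, net_params)
```

In this sketch a confidently correct example contributes very little loss, while a hard or possibly mislabeled one contributes more, and the weighting network decides how much of that contribution is kept. Whether high-loss examples should be up-weighted (rare classes) or down-weighted (noisy labels) is exactly what the learned weighting function would have to resolve from the metadata.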