Elastic weight consolidation for better bias inoculation

James Thorne, Andreas Vlachos

DOI: https://doi.org/10.48550/arXiv.2004.14366

2021-02-04

Abstract: The biases present in training datasets have been shown to affect models for sentence pair classification tasks such as natural language inference (NLI) and fact verification. While fine-tuning models on additional data has been used to mitigate them, a common issue is that of catastrophic forgetting of the original training dataset. In this paper, we show that elastic weight consolidation (EWC) allows fine-tuning of models to mitigate biases while being less susceptible to catastrophic forgetting. In our evaluation on fact verification and NLI stress tests, we show that fine-tuning with EWC dominates standard fine-tuning, yielding models with lower levels of forgetting on the original (biased) dataset for equivalent gains in accuracy on the fine-tuning (unbiased) dataset.

Computation and Language, Machine Learning

What problem does this paper attempt to address?

This paper addresses the effect of training-dataset biases on models for sentence-pair classification tasks such as natural language inference (NLI) and fact verification. Fine-tuning a model on additional data can mitigate these biases, but it commonly causes catastrophic forgetting: the model loses accuracy on the original training dataset. The authors therefore fine-tune with elastic weight consolidation (EWC), which lets the model mitigate biases while being less susceptible to catastrophic forgetting. In evaluations on fact verification and NLI stress tests, fine-tuning with EWC dominates standard fine-tuning: for equivalent gains in accuracy on the fine-tuning (unbiased) dataset, it yields lower levels of forgetting on the original (biased) dataset. The core contribution is showing that EWC reduces catastrophic forgetting during debiasing fine-tuning, which matters for natural language processing systems that must keep adapting to new data or shifted distributions without losing what they learned on the original task.
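As a rough illustration of the mechanism, EWC is commonly formulated (following Kirkpatrick et al., 2017) as adding a quadratic penalty, (λ/2) Σ_i F_i (θ_i − θ*_i)², to the fine-tuning loss, where θ* are the weights learned on the original dataset and F_i is a diagonal Fisher information estimate of how important each weight was there. The sketch below assumes that standard form; the function names, the toy values, and the plain gradient-descent update are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence


def ewc_penalty(
    theta: Sequence[float],
    theta_star: Sequence[float],
    fisher: Sequence[float],
    lam: float,
) -> float:
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2."""
    return 0.5 * lam * sum(
        f * (t - s) ** 2 for t, s, f in zip(theta, theta_star, fisher)
    )


def ewc_step(
    theta: Sequence[float],
    task_grad: Sequence[float],
    theta_star: Sequence[float],
    fisher: Sequence[float],
    lam: float,
    lr: float,
) -> List[float]:
    """One gradient-descent step on (fine-tuning loss + EWC penalty).

    task_grad is the gradient of the fine-tuning loss alone; the penalty
    contributes lam * F_i * (theta_i - theta_star_i) for each weight.
    """
    return [
        t - lr * (g + lam * f * (t - s))
        for t, g, s, f in zip(theta, task_grad, theta_star, fisher)
    ]


# Toy setup (hypothetical values): the first weight mattered a lot on the
# original dataset (large Fisher value), the second did not matter at all.
theta_star = [1.0, -2.0]
fisher = [10.0, 0.0]

theta = list(theta_star)
for _ in range(2):
    # A constant fine-tuning gradient pushes both weights downward.
    theta = ewc_step(theta, [1.0, 1.0], theta_star, fisher, lam=1.0, lr=0.1)

penalty = ewc_penalty(theta, theta_star, fisher, lam=1.0)
```

In this toy run the weight with the large Fisher value stops moving after the first step, because the penalty gradient cancels the fine-tuning gradient, while the weight with zero Fisher value keeps moving freely. That is the sense in which EWC trades fine-tuning gains against forgetting, with λ controlling the balance.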

IncDet: in Defense of Elastic Weight Consolidation for Incremental Object Detection

Elastic Feature Consolidation for Cold Start Exemplar-Free Incremental Learning

EVCL: Elastic Variational Continual Learning with Weight Consolidation

Self-paced Weight Consolidation for Continual Learning

Stubborn Lexical Bias in Data and Models

Unsupervised Pretraining for Neural Machine Translation Using Elastic Weight Consolidation

Towards continuous learning for glioma segmentation with elastic weight consolidation

RTRA: Rapid Training of Regularization-based Approaches in Continual Learning

Reducing Bias in Pre-trained Models by Tuning while Penalizing Change

Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness

Inverse Weight-Balancing for Deep Long-Tailed Learning

Catastrophic Forgetting in the Context of Model Updates

OffsetBias: Leveraging Debiased Data for Tuning Evaluators

Downstream bias mitigation is all you need

Gradient Reweighting: Towards Imbalanced Class-Incremental Learning

Task-recency bias strikes back: Adapting covariances in Exemplar-Free Class Incremental Learning

New Insights on Relieving Task-Recency Bias for Online Class Incremental Learning

Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning

AWF: Adaptive Weight Fusion for Enhanced Class Incremental Semantic Segmentation