Abstract: Pre-trained Language Models are widely used in many important real-world applications. However, recent studies show that these models can encode social biases from large pre-training corpora and even amplify biases in downstream applications. To address this challenge, we propose Co$^2$PT, an efficient and effective debias-while-prompt tuning method for mitigating biases via counterfactual contrastive prompt tuning on downstream tasks. Our experiments conducted on three extrinsic bias benchmarks demonstrate the effectiveness of Co$^2$PT on bias mitigation during the prompt tuning process and its adaptability to existing upstream debiased language models. These findings indicate the strength of Co$^2$PT and provide promising avenues for further enhancement in bias mitigation on downstream tasks.

What problem does this paper attempt to address?

This paper addresses the problem of mitigating social biases in pre-trained language models (PLMs). Existing research shows that these models encode unfair social biases when pre-trained on large-scale text corpora and may amplify those biases in downstream tasks. For example, in language modeling, "She is a nurse" may receive a higher conditional probability than "He is a nurse"; in coreference resolution, the coreference score between "nurse" and "she" may be higher than the score between "nurse" and "he". Because natural language processing (NLP) applications such as machine translation, resume screening, dialogue, and speech recognition systems are used by millions of people around the world, mitigating social biases in these models is crucial to avoid discriminatory predictions or offensive outputs directed at specific groups.

To address this challenge, the paper proposes Co$^2$PT (Counterfactual Contrastive Prompt Tuning), an efficient and effective method that mitigates bias during prompt tuning rather than in a separate step. By applying counterfactual contrastive prompt tuning on downstream tasks, Co$^2$PT aims to reduce model bias while the prompts are being tuned, and it can also be applied on top of existing upstream debiased language models. Experiments on three extrinsic bias benchmarks show that Co$^2$PT mitigates bias on downstream tasks and adapts to existing debiased language models. These findings indicate the strength of Co$^2$PT and point to promising avenues for further improving bias mitigation on downstream tasks.
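The idea of pairing a sentence with its counterfactual and pulling the two representations together can be illustrated with a minimal sketch. This is not the authors' implementation: the word-pair list `GENDER_PAIRS`, the helper names `counterfactual` and `info_nce`, the InfoNCE-style loss, and the temperature of 0.05 are all assumptions chosen for illustration, and plain Python lists stand in for the sentence embeddings a language model would produce.

```python
import math
import re

# Hypothetical, deliberately tiny list of gendered word pairs (an assumption;
# a real counterfactual augmentation list would be much larger).
GENDER_PAIRS = [("he", "she"), ("man", "woman"), ("father", "mother"), ("boy", "girl")]

# Build a bidirectional lookup so each word maps to its counterpart.
SWAP = {}
for a, b in GENDER_PAIRS:
    SWAP[a] = b
    SWAP[b] = a


def counterfactual(sentence):
    """Return the sentence with every listed gendered word swapped."""
    def repl(match):
        word = match.group(0)
        swapped = SWAP.get(word.lower())
        if swapped is None:
            return word
        # Preserve a leading capital letter, e.g. "She" -> "He".
        return swapped.capitalize() if word[0].isupper() else swapped
    return re.sub(r"[A-Za-z]+", repl, sentence)


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def info_nce(originals, counterfactuals, temperature=0.05):
    """InfoNCE-style loss: embedding i of `originals` should be closest to
    embedding i of `counterfactuals`; the other rows act as negatives."""
    total = 0.0
    for i, anchor in enumerate(originals):
        logits = [cosine(anchor, c) / temperature for c in counterfactuals]
        # Numerically stable log-sum-exp for the softmax denominator.
        top = max(logits)
        log_denom = top + math.log(sum(math.exp(l - top) for l in logits))
        total += -(logits[i] - log_denom)
    return total / len(originals)


if __name__ == "__main__":
    print(counterfactual("She is a nurse"))
    # Toy 2-d "embeddings": the loss is small when each pair is aligned.
    print(info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]))
```

In a prompt-tuning setting, a loss of this kind would typically be added to the task loss and back-propagated only into the prompt parameters while the language model stays frozen; how Co$^2$PT combines these terms is not stated in the text above, so that detail is left out of the sketch.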

Co$^2$PT: Mitigating Bias in Pre-trained Language Models through Counterfactual Contrastive Prompt Tuning

Prompt Tuning Pushes Farther, Contrastive Learning Pulls Closer: A Two-Stage Approach to Mitigate Social Biases

Mitigating Social Biases of Pre-trained Language Models via Contrastive Self-Debiasing with Double Data Augmentation

Reasoning Beyond Bias: A Study on Counterfactual Prompting and Chain of Thought Reasoning

An Empirical Analysis of Parameter-Efficient Methods for Debiasing Pre-Trained Language Models

Can Instruction Fine-Tuned Language Models Identify Social Bias through Prompting?

Causal-Debias: Unifying Debiasing in Pretrained Language Models and Fine-tuning via Causal Invariant Learning

Prompt-Based Bias Calibration for Better Zero/Few-Shot Learning of Language Models

Reducing Sentiment Bias in Language Models via Counterfactual Evaluation

Biases Mitigation and Expressiveness Preservation in Language Models: A Comprehensive Pipeline (Student Abstract)

Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness

Causal Prompting: Debiasing Large Language Model Prompting based on Front-Door Adjustment

Debiasing NLU Models via Causal Intervention and Counterfactual Reasoning

DeCoOp: Robust Prompt Tuning with Out-of-Distribution Detection

Walking in Others' Shoes: How Perspective-Taking Guides Large Language Models in Reducing Toxicity and Bias

Can Prompt Probe Pretrained Language Models? Understanding the Invisible Risks from a Causal View

Making Pre-trained Language Models End-to-end Few-shot Learners with Contrastive Prompt Tuning

Promoting Equality in Large Language Models: Identifying and Mitigating the Implicit Bias based on Bayesian Theory

A Contrastive Learning Approach to Mitigate Bias in Speech Models