Model-based Counterfactual Generator for Gender Bias Mitigation

Ewoenam Kwaku Tokpo, Toon Calders
2023-11-06
Abstract: Counterfactual Data Augmentation (CDA) has been one of the preferred techniques for mitigating gender bias in natural language models. CDA techniques have mostly employed word substitution based on dictionaries. Although such dictionary-based CDA techniques have been shown to significantly improve the mitigation of gender bias, in this paper, we highlight some limitations of such dictionary-based counterfactual data augmentation techniques, such as susceptibility to ungrammatical compositions, and lack of generalization outside the set of predefined dictionary words. Model-based solutions can alleviate these problems, yet the lack of qualitative parallel training data hinders development in this direction. Therefore, we propose a combination of data processing techniques and a bi-objective training regime to develop a model-based solution for generating counterfactuals to mitigate gender bias. We implemented our proposed solution and performed an empirical evaluation which shows how our model alleviates the shortcomings of dictionary-based solutions.
Computation and Language
What problem does this paper attempt to address?
This paper addresses the limitations of existing dictionary-based counterfactual data augmentation (CDA) techniques for mitigating gender bias in natural language models. Specifically:

1. **Grammatical incoherence**: Dictionary-based methods are prone to producing ungrammatical sentences because they usually rely on simple word-replacement rules that ignore context.
2. **Limited generalization ability**: These methods depend on a predefined dictionary and cannot handle words outside it.
3. **Lack of high-quality parallel data**: Model-based methods require a large amount of parallel data for training, but such data is scarce, which hinders the development of model-based CDA techniques.

To overcome these problems, the authors propose a model-based counterfactual data generator (MBCDA), which combines data processing techniques with a dual-objective training strategy to generate higher-quality counterfactual texts and mitigate gender bias more effectively. The specific contributions are:

1. **Data processing pipeline**: Generates high-quality parallel data from the output of dictionary-based CDA.
2. **Dual-objective training model**: Uses a generator and a discriminator so that the generated counterfactual texts are both grammatically correct and effective at changing the gender association of the original text.

Through these innovations, the authors aim to provide a more robust and effective alternative to existing dictionary-based CDA methods.
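
To make the two limitations of dictionary-based CDA concrete, the following is a minimal sketch of word-substitution CDA. It is not the paper's implementation: the `SWAP` dictionary and the `dictionary_cda` function are hypothetical names chosen for illustration, and the word list is deliberately tiny.

```python
import re

# Hypothetical, deliberately small swap dictionary (an assumption for
# illustration; real dictionary-based CDA uses much larger word lists).
SWAP = {
    "he": "she", "she": "he",
    "his": "her", "him": "her",
    # "her" is ambiguous: it can be possessive ("his") or an object ("him").
    # A context-free dictionary has to pick one mapping.
    "her": "his",
    "man": "woman", "woman": "man",
}


def dictionary_cda(sentence: str) -> str:
    """Swap gendered words using SWAP, ignoring context entirely."""

    def swap(match: re.Match) -> str:
        word = match.group(0)
        target = SWAP.get(word.lower())
        if target is None:
            return word  # words outside the dictionary are left untouched
        # Preserve the capitalization of the first letter.
        return target.capitalize() if word[0].isupper() else target

    return re.sub(r"[A-Za-z]+", swap, sentence)


# Ungrammatical composition: object "her" is mapped to possessive "his".
print(dictionary_cda("He thanked her."))           # She thanked his.

# Lack of generalization: "policewoman" is not in the dictionary,
# so only "him" is swapped.
print(dictionary_cda("The policewoman saw him."))  # The policewoman saw her.
```

Both outputs show failures that a context-aware, model-based generator is intended to avoid: the first sentence is ungrammatical, and the second is only partially converted.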