Text Detoxification as Style Transfer in English and Hindi

Sourabrata Mukherjee, Akanksha Bansal, Atul Kr. Ojha, John P. McCrae, Ondřej Dušek

2024-06-10

Abstract: This paper focuses on text detoxification, i.e., automatically converting toxic text into non-toxic text. This task contributes to safer and more respectful online communication and can be considered a Text Style Transfer (TST) task, where the text style changes while its content is preserved. We present three approaches: (i) knowledge transfer from a similar task, (ii) a multi-task learning approach that combines sequence-to-sequence modeling with various toxicity classification tasks, and (iii) a delete-and-reconstruct approach. To support our research, we utilize a dataset provided by Dementieva et al. (2021), which contains multiple versions of detoxified texts corresponding to toxic texts. In our experiments, we selected the best variants through expert human annotators, creating a dataset where each toxic sentence is paired with a single, appropriate detoxified version. Additionally, we introduced a small Hindi parallel dataset, aligned with a part of the English dataset and suitable for evaluation purposes. Our results demonstrate that our approach effectively balances text detoxification with preserving the actual content and maintaining fluency.

Computation and Language

What problem does this paper attempt to address?

The paper addresses text detoxification: automatically converting text that contains offensive or harmful content into non-offensive, non-harmful text. The task can be framed as Text Style Transfer (TST), where the source style is toxic language and the target style is non-toxic language. The goal is to rewrite harmful or offensive text in a neutral or positive style while retaining its core content and fluency.

The authors propose three methods to improve on plain sequence-to-sequence training:

1. **Knowledge Transfer**: Transfer knowledge from a similar task.
2. **Multi-task Learning**: Combine sequence-to-sequence modeling with various toxicity classification tasks.
3. **Delete and Reconstruct**: Delete toxic vocabulary, then reconstruct the sentence.

Additionally, the study uses the dataset provided by Dementieva et al. (2021) and creates a Hindi dataset of 500 parallel sentences for evaluation purposes. Through these methods, the authors hope to improve text detoxification in low-resource settings and promote a safer and more respectful online communication environment.
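To make the delete-and-reconstruct idea more concrete, the following is a minimal sketch of only the "delete" step. It is not the authors' implementation: the lexicon `TOXIC_LEXICON`, the function `delete_toxic_tokens`, the `<mask>` placeholder, and the simple English-oriented tokenizer are all assumptions introduced for illustration. The summary does not say how toxic words are identified or how the reconstruction model is trained, so the reconstruction step is left out.

```python
import re

# Hypothetical placeholder lexicon (an assumption for this sketch);
# a real system would rely on a curated word list or a classifier.
TOXIC_LEXICON = {"stupid", "idiot", "dumb"}

# Simple tokenizer: runs of word characters, or single punctuation marks.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def delete_toxic_tokens(sentence, lexicon=TOXIC_LEXICON, mask="<mask>"):
    """Replace every token found in the lexicon with a mask placeholder.

    Returns the masked sentence (tokens joined by single spaces) and
    the list of removed tokens in their original order.
    """
    masked, removed = [], []
    for token in TOKEN_RE.findall(sentence):
        if token.lower() in lexicon:
            masked.append(mask)
            removed.append(token)
        else:
            masked.append(token)
    return " ".join(masked), removed


if __name__ == "__main__":
    masked, removed = delete_toxic_tokens("That was a stupid idea, you idiot!")
    print(masked)
    print(removed)
```

In a full pipeline, the masked sentence would be passed to a sequence-to-sequence model that fills in or rewrites the masked positions to produce a fluent non-toxic sentence. A lexicon lookup is the simplest possible deletion criterion and is used here only because it keeps the sketch self-contained.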

Exploring Methods for Cross-lingual Text Style Transfer: The Case of Text Detoxification

Text Detoxification using Large Pre-trained Neural Models

Exploring Cross-lingual Textual Style Transfer with Large Multilingual Language Models

Methods for Detoxification of Texts for the Russian Language

Russian Texts Detoxification with Levenshtein Editing

MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages

Multilingual Text Style Transfer: Datasets & Models for Indian Languages

Multilingual Text Detoxification Using Google Cloud Translation and Post-Processing

Learning from Response not Preference: A Stackelberg Approach for LLM Detoxification using Non-parallel Data

GPT-DETOX: An In-Context Learning-Based Paraphraser for Text Detoxification

DiffuDetox: A Mixed Diffusion Model for Text Detoxification

Detoxifying Text with MaRCo: Controllable Revision with Experts and Anti-Experts

Are Large Language Models Actually Good at Text Style Transfer?

A Survey of Text Style Transfer: Applications and Ethical Implications

Text Style Transfer: An Introductory Overview

Simple Text Detoxification by Identifying a Linear Toxic Subspace in Language Model Embeddings

Mitigating Text Toxicity with Counterfactual Generation

Challenges in Detoxifying Language Models

DetoxLLM: A Framework for Detoxification with Explanations

A Multilingual Text Detoxification Method Based on Few-shot Learning and CO-STAR Framework