Multilingual Text Detoxification Using Google Cloud Translation and Post-Processing

Aiguo Wang, Man Luo, Zhongyu Luo
Abstract: Text detoxification aims to rewrite toxic text as non-toxic text. Although existing methods achieve impressive detoxification performance in monolingual settings, multilingual text detoxification remains challenging because of the complexity of natural languages and the lack of sufficient data for training accurate models for low-resource languages. In this study, we propose a cross-lingual text detoxification model, named GCTP, that uses Google Cloud Translation and post-processing for the PAN@CLEF 2024 multilingual text detoxification task. Specifically, GCTP first translates the non-English text into English and detoxifies it with a pretrained English model. It then translates the result back into the original language and removes remaining toxic keywords using predefined dictionaries for that language. Extensive comparative experiments on the competition datasets show that GCTP achieves the highest J score for Amharic text and ranks fifth for Chinese text, demonstrating its effectiveness.
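The three stages described in the abstract (translate to English, detoxify, translate back and filter with a dictionary) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `gctp_sketch`, `remove_toxic_keywords`, `translate`, and `detoxify_en` are hypothetical, the translation and detoxification steps are passed in as plain callables rather than tied to a specific Google Cloud Translation client or pretrained model, and the dictionary filter assumes simple case-insensitive substring deletion.

```python
import re
from typing import Callable, Iterable

# Hypothetical callable types (assumptions, not from the paper):
# translate(text, source_lang, target_lang) -> translated text
Translate = Callable[[str, str, str], str]
# detoxify_en(english_text) -> detoxified English text
Detoxify = Callable[[str], str]


def remove_toxic_keywords(text: str, toxic_words: Iterable[str]) -> str:
    """Delete dictionary entries from text and collapse leftover whitespace."""
    # Longest entries first so multi-word phrases match before their parts.
    words = sorted((w for w in toxic_words if w), key=len, reverse=True)
    if not words:
        return text
    # Substring matching is an assumption; it works for scripts without
    # word delimiters (e.g. Chinese) but can over-delete in other languages.
    pattern = re.compile("|".join(re.escape(w) for w in words), re.IGNORECASE)
    cleaned = pattern.sub("", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


def gctp_sketch(
    text: str,
    lang: str,
    translate: Translate,
    detoxify_en: Detoxify,
    toxic_words: Iterable[str],
) -> str:
    """Translate to English, detoxify, translate back, then filter keywords."""
    english = translate(text, lang, "en")
    detoxified = detoxify_en(english)
    back_translated = translate(detoxified, "en", lang)
    return remove_toxic_keywords(back_translated, toxic_words)
```

Passing the translator and the English detoxifier as arguments keeps the sketch independent of any particular service or model; in practice they would wrap a translation client and a pretrained English detoxification model.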
Linguistics, Computer Science