Abstract: The automated identification of toxicity in text is a crucial area of text analysis, since social media is replete with unfiltered content that ranges from mildly abusive to downright hateful. Researchers have found that training datasets introduce unintended bias and unfairness, which leads to inaccurate classification of toxic words in context. In this paper, several approaches for detecting toxicity in text are presented and assessed, with the aim of enhancing the overall quality of text classification. General unsupervised methods built on state-of-the-art models and external embeddings were used to improve accuracy while mitigating bias and improving the F1-score. The proposed approaches combine a long short-term memory (LSTM) deep learning model with GloVe word embeddings and with word embeddings generated by Bidirectional Encoder Representations from Transformers (BERT), respectively. These models were trained and tested on a large secondary qualitative dataset containing a large number of comments labeled as toxic or nontoxic. LSTM with BERT word embeddings achieved an accuracy of 94% and an F1-score of 0.89 in the binary classification of comments (toxic and nontoxic). The combination of LSTM and BERT performed better than both LSTM alone and LSTM with GloVe word embeddings. This paper addresses the problem of classifying comments with high accuracy by pretraining models on larger text corpora (high-quality word embeddings) rather than relying solely on the training data.
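The abstract reports both accuracy and F1-score for the binary (toxic vs. nontoxic) task. As a minimal sketch of how these two metrics relate, the snippet below computes them from predicted and true labels. The function name `binary_metrics` and the toy labels are illustrative assumptions, not taken from the paper, and the numbers are not the paper's results.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for labels where 1 = toxic, 0 = nontoxic.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall on the toxic class.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy, made-up labels: 2 toxic comments out of 10, one of them missed.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On imbalanced data such as this toy example, accuracy can stay high while F1 on the toxic class drops, which is one reason to report both figures.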

DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances

On the Role of Speech Data in Reducing Toxicity Detection Bias

MuTox: Universal MUltilingual Audio-based TOXicity Dataset and Zero-shot Detector

ToxVidLM: A Multimodal Framework for Toxicity Detection in Code-Mixed Videos

Beyond Toxic: Toxicity Detection Datasets are Not Enough for Brand Safety

ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection

ToVo: Toxicity Taxonomy via Voting

Toxicity Detection can be Sensitive to the Conversational Context

LLM-Based Synthetic Datasets: Applications and Limitations in Toxicity Detection

CONDA: a CONtextual Dual-Annotated dataset for in-game toxicity understanding and detection

Lightweight Toxicity Detection in Spoken Language: A Transformer-based Approach for Edge Devices

Fortifying Toxic Speech Detectors Against Veiled Toxicity

Enhancing Multilingual Voice Toxicity Detection with Speech-Text Alignment

Facilitating Fine-grained Detection of Chinese Toxic Language: Hierarchical Taxonomy, Resources, and Benchmarks

Toxic Language Detection in Social Media for Brazilian Portuguese: New Dataset and Multilingual Analysis

An Automated Toxicity Classification on Social Media Using LSTM and Word Embedding

Toxicity of the Commons: Curating Open-Source Pre-Training Data

Challenges in Detoxifying Language Models

ToxicChat: Unveiling Hidden Challenges of Toxicity Detection in Real-World User-AI Conversation

Modeling subjectivity (by Mimicking Annotator Annotation) in toxic comment identification across diverse communities

Realistic Evaluation of Toxicity in Large Language Models