Hate Speech Detection using OpenAI and GPT-3

Sachin Gupta
DOI: https://doi.org/10.46338/ijetae0522_15
2022-05-01
International Journal of Emerging Technology and Advanced Engineering
Abstract: Toxic speech spread through online media is a critical concern in cyberspace today. The problem involves offensive, violent, or abusive language directed at an individual, a group, or a minority community. Among the most advanced recent language models for natural language processing (NLP) is OpenAI's generative pre-trained transformer model, named GPT-3, which can both produce and predict hateful text that bullies a specific community or group. Given this capability, the question is whether large language models can also be used to analyze and identify hate speech and to classify text as positive or negative. GPT-3 classifies text as hate or non-hate speech under different learning settings, ranging from zero-shot to few-shot. Of these, zero-shot and one-shot learning achieve an accuracy between 45 and 72 percent, while few-shot learning achieves an accuracy of up to 80 percent. The results of this study indicate that large language models can play a vital role in detecting hate and toxic speech, but that they need further development in order to counter hateful content and help make cyberspace safer.
Keywords: Cybersafety, GPT-3, Language models, NLP, Toxicity in speech
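The zero-, one-, and few-shot settings named in the abstract differ only in how many labeled examples are placed in the prompt before the text to be classified. The sketch below illustrates that idea; it is not the paper's actual prompt or code. The instruction wording, the `build_prompt` and `parse_label` helpers, and the example texts are all assumptions made for illustration, and the call to a language model is left out.

```python
from typing import Sequence, Tuple

# Hypothetical instruction text; the paper's real prompt wording is not given in the abstract.
INSTRUCTION = "Classify the following text as 'hate' or 'non-hate'."


def build_prompt(text: str, examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Build a classification prompt.

    With no examples this is a zero-shot prompt, with one example a
    one-shot prompt, and with several examples a few-shot prompt.
    """
    parts = [INSTRUCTION, ""]
    for example_text, example_label in examples:
        parts.append(f"Text: {example_text}")
        parts.append(f"Label: {example_label}")
        parts.append("")
    # The final entry is left unlabeled so the model completes the label.
    parts.append(f"Text: {text}")
    parts.append("Label:")
    return "\n".join(parts)


def parse_label(completion: str) -> str:
    """Map a raw model completion to 'hate', 'non-hate', or 'unknown'."""
    answer = completion.strip().lower()
    # Check the longer label first, since "non-hate" also contains "hate".
    if answer.startswith("non-hate") or answer.startswith("non hate"):
        return "non-hate"
    if answer.startswith("hate"):
        return "hate"
    return "unknown"


# Illustrative placeholder examples, not data from the paper.
demo_examples = [
    ("People like you do not deserve to live here.", "hate"),
    ("The match last night was a great game.", "non-hate"),
    ("I disagree with this policy, but I respect the debate.", "non-hate"),
]

zero_shot = build_prompt("Thanks for the helpful answer.")
one_shot = build_prompt("Thanks for the helpful answer.", demo_examples[:1])
few_shot = build_prompt("Thanks for the helpful answer.", demo_examples)
```

Each prompt string would then be sent to the model, and the returned completion passed through `parse_label`. Keeping prompt construction separate from the model call lets the three settings be compared on the same inputs.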